pmhaddad/datacamp_projects

Introduction

This repository contains the projects I am developing for the classes I'm taking on DataCamp.
Projects are divided by language: Python, R and SQL.

Python projects

1. Exploring 67 years of LEGO

The Rebrickable database includes data on every LEGO set that has ever been sold: the names of the sets, what bricks they contain, what color the bricks are, and so on. They might be small bricks, but this is big data! In this project, you will explore the Rebrickable database. To do this, you need to know your way around pandas DataFrames, and it's recommended that you take a look at the courses pandas Foundations and Manipulating DataFrames with pandas.

Click here to access a rendered version of the Jupyter notebook


2. Dr. Semmelweis and the Discovery of Handwashing

In 1847, the Hungarian physician Ignaz Semmelweis made a breakthrough discovery: handwashing. Contaminated hands were a major cause of childbed fever, and by enforcing handwashing at his hospital he saved hundreds of lives.
In this Python project we will reanalyze the medical data Semmelweis collected. This project assumes that you are familiar with Python and pandas DataFrames. You can learn the required skills in these courses: Intermediate Python for Data Science and pandas Foundations.

Click here to access a rendered version of the Jupyter notebook


3. Exploring the Bitcoin Cryptocurrency Market

To better understand the growth and impact of Bitcoin and other cryptocurrencies you will, in this project, explore the market capitalization of different cryptocurrencies.
Warning: The cryptocurrency market is exceptionally volatile, and any money you put in might disappear into thin air. Never invest money you can't afford to lose.
To complete this project, you need to be fluent with pandas DataFrames. Before starting this project, we recommend that you have completed the following courses: pandas Foundations, Manipulating DataFrames with pandas, and Cleaning Data in Python.

Click here to access a rendered version of the Jupyter notebook


4. Exploring the Evolution of Linux

Version control repositories like CVS, Subversion or Git store rich evolution information about a software project. In this project, you'll be challenged to read in, clean up and visualize a real-world Git repository dataset of the Linux kernel. With almost 700k commits and thousands of contributors (find out the exact number in this project ;-) ), there are a few data cleaning and wrangling challenges that you'll encounter. But you'll also gain insights into the development activities over the last 13 years. For this project, you need to be familiar with pandas DataFrames, especially the read_csv and groupby functions, as well as working with time series data.
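As a rough illustration of the read_csv and groupby skills mentioned above, here is a minimal sketch. The `timestamp#author` layout, the column names, and the three sample rows are assumptions made up for this example; they are not taken from the project's dataset.

```python
import io

import pandas as pd

# Hypothetical sample in an assumed "timestamp#author" layout
# (Unix epoch seconds, then the author name). Not real project data.
SAMPLE_LOG = io.StringIO(
    "1452000000#alice\n"
    "1484000000#bob\n"
    "1485000000#alice\n"
)

# Read the log, naming the columns ourselves since the sample has no header row.
git_log = pd.read_csv(
    SAMPLE_LOG, sep="#", header=None, names=["timestamp", "author"]
)

# Convert epoch seconds into proper datetimes for time series work.
git_log["timestamp"] = pd.to_datetime(git_log["timestamp"], unit="s")

# Commits per author, most active first.
commits_per_author = (
    git_log.groupby("author")["timestamp"].count().sort_values(ascending=False)
)

# Commits per calendar year.
commits_per_year = git_log.groupby(git_log["timestamp"].dt.year)["author"].count()

print(commits_per_author)
print(commits_per_year)
```

Grouping by `git_log["timestamp"].dt.year` keeps the sketch short; resampling on a datetime index would be another reasonable way to get per-year counts.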

Click here to access a rendered version of the Jupyter notebook


R projects

1. Phyllotaxis: Draw Flowers Using Mathematics

R is a tool for doing serious statistics and data analysis. But not everything in life can be serious; life is also beautiful, and R can make beautiful things too. R can make art.
The arrangement of leaves on a plant stem is ruled by spirals. This fact is called phyllotaxis and it is a nice example of how mathematics can describe patterns in nature. In this project, we will invent flowers using this fact.
This R project assumes you have familiarity with the ggplot2 package. If you don't know ggplot2 we recommend you take either of the courses Introduction to the Tidyverse or Data Visualization with ggplot2 (Part 1). If you want to see more examples of how you can use R to make art, you should check out the Fronkonstin blog created by Antonio Sánchez Chinchón.
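To give a feel for how spirals can generate a flower-like pattern, here is a minimal sketch of one common phyllotaxis model, in which each point is rotated by the golden angle and pushed outward with the square root of its index. This is an assumption about the general technique, not the project's own code, and it is written in Python rather than R with ggplot2. The function name `phyllotaxis_points` is hypothetical.

```python
import math

# Golden angle in radians (about 137.5 degrees).
GOLDEN_ANGLE = math.pi * (3 - math.sqrt(5))


def phyllotaxis_points(n_points):
    """Return (x, y) coordinates for n_points arranged in a spiral pattern."""
    points = []
    for i in range(n_points):
        radius = math.sqrt(i)      # points drift outward as the index grows
        theta = i * GOLDEN_ANGLE   # each point is rotated by the golden angle
        points.append((radius * math.cos(theta), radius * math.sin(theta)))
    return points


flower = phyllotaxis_points(500)
print(flower[:3])
```

Plotting these coordinates as a scatter plot gives the spiral arrangement; varying the angle or the point size changes the look of the flower.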

Click here to access a rendered version of the Jupyter notebook


2. Exploring the Kaggle Data Science Survey

When beginning a career in data science, one often wonders what programming tools and languages are being used in the industry, and what skills one should learn first. By exploring the 2017 Kaggle Data Science Survey results, you can learn about the tools used by 10,000+ people in the professional data science community. Before starting this project, you should be comfortable manipulating data frames and have some experience working with the tidyverse packages dplyr, tidyr, and ggplot2. This project uses a subset of the 2017 Kaggle Machine Learning and Data Science Survey dataset. If you want to know more about the tools and techniques Kaggle participants use, check out the full report of the Kaggle 2017 survey results.

Click here to access a rendered version of the Jupyter notebook


SQL projects

1. Analyze International Debt Statistics

It is not only individuals who take on debt to manage their necessities. A country may also take on debt to manage its economy. For example, infrastructure spending is one costly ingredient required for a country's citizens to lead comfortable lives. The World Bank is the organization that provides debt to countries. In this project, you are going to analyze international debt data collected by The World Bank. The dataset contains information about the amount of debt (in USD) owed by developing countries across several categories. You are going to find the answers to questions like:

- What is the total amount of debt that is owed by the countries listed in the dataset?
- Which country owes the maximum amount of debt and what does that amount look like?
- What is the average amount of debt owed by countries across different debt indicators?

The data used in this project is provided by The World Bank. It contains both national and regional debt statistics for several countries across the globe as recorded from 1970 to 2015.
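The three questions above map onto simple aggregate queries. The sketch below runs them against an in-memory SQLite database from Python. The table name `international_debt`, its column names, and every figure in it are made-up placeholders for illustration, not the World Bank data or the project's actual schema.

```python
import sqlite3

# Placeholder rows: (country, debt indicator, debt in USD). Not real data.
ROWS = [
    ("Country A", "Indicator X", 100.0),
    ("Country A", "Indicator Y", 50.0),
    ("Country B", "Indicator X", 300.0),
    ("Country B", "Indicator Y", 150.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE international_debt "
    "(country_name TEXT, indicator_name TEXT, debt REAL)"
)
conn.executemany("INSERT INTO international_debt VALUES (?, ?, ?)", ROWS)

# Total debt owed by all countries in the table.
total_debt = conn.execute(
    "SELECT SUM(debt) FROM international_debt"
).fetchone()[0]

# Country with the largest total debt, and that total.
top_country = conn.execute(
    "SELECT country_name, SUM(debt) AS total_debt "
    "FROM international_debt "
    "GROUP BY country_name "
    "ORDER BY total_debt DESC "
    "LIMIT 1"
).fetchone()

# Average debt per indicator, across countries.
avg_by_indicator = dict(
    conn.execute(
        "SELECT indicator_name, AVG(debt) "
        "FROM international_debt "
        "GROUP BY indicator_name"
    ).fetchall()
)

print(total_debt, top_country, avg_by_indicator)
```

The same `SELECT` statements should carry over to other SQL engines with little change, since they rely only on `SUM`, `AVG`, `GROUP BY`, `ORDER BY` and `LIMIT`.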

Click here to access a rendered version of the Jupyter notebook


2. What and Where Are the World's Oldest Businesses?

An important part of business is planning for the future and ensuring that the company survives changing market conditions. Some businesses do this really well and last for hundreds of years. In this project, you'll explore data from BusinessFinancing.co.uk on the world's oldest businesses: when they were founded and which industries they belong to.

Click here to access a rendered version of the Jupyter notebook
