You need to be able to work in a Jupyter Notebook on your computer. The following packages (libraries) need to be installed. You can install these packages via conda or pip.
- pandas
- Matplotlib
- NumPy
- csv (part of the Python standard library, so it needs no separate installation)
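
As a quick sanity check after installing, something like the following could be run in a notebook cell to see which of the packages Python can find. This is only a sketch: the helper name `check_packages` is made up for illustration, and the list uses the import names `pandas`, `matplotlib`, and `numpy`.

```python
from importlib.util import find_spec

# Import names of the third-party packages listed above.
# csv is left out because it ships with Python itself.
REQUIRED = ["pandas", "matplotlib", "numpy"]


def check_packages(names):
    """Return a dict mapping each package name to True if Python can locate it.

    check_packages is a hypothetical helper, not part of any library.
    """
    return {name: find_spec(name) is not None for name in names}


status = check_packages(REQUIRED)
for name, installed in status.items():
    print(f"{name}: {'installed' if installed else 'missing'}")
```

Any package reported as missing can then be installed with conda or pip before opening the notebook.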
In this project, we go through the data analysis process and see how everything fits together: we analyze a dataset and then communicate our findings about it. I used the Python libraries NumPy, pandas, and Matplotlib, which make writing data analysis code in Python a lot easier.
After completing the project, I have learned the following:
- Know all the steps involved in a typical data analysis process
- Be comfortable posing questions that can be answered with a given dataset and then answering those questions
- Know how to investigate problems in a dataset and wrangle the data into a format you can use
- Have practice communicating the results of your analysis
- Be able to use vectorized operations in NumPy and pandas to speed up your data analysis code
- Be familiar with Pandas Series and DataFrame objects, which let you access your data more conveniently
- Last but not least, know how to use Matplotlib and Seaborn to produce plots that show your findings.
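
To make the points about vectorized operations and pandas objects a bit more concrete, here is a minimal sketch. The column names and numbers are invented toy data, not the project's actual dataset, and the zero-means-missing rule is an assumption made only for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the project's real dataset is not reproduced here.
df = pd.DataFrame({
    "budget": [100.0, 250.0, 40.0, 0.0],
    "revenue": [300.0, 200.0, 120.0, 50.0],
})

# Vectorized arithmetic: the subtraction runs over whole columns (Series)
# at once, with no explicit Python loop.
df["profit"] = df["revenue"] - df["budget"]

# Wrangling step (an assumption for this example): treat a budget of 0
# as a missing value so it does not distort ratios.
df["budget"] = df["budget"].replace(0.0, np.nan)
df["roi"] = df["profit"] / df["budget"]

# A Series exposes summary statistics directly.
mean_profit = df["profit"].mean()

# Boolean masks are vectorized too; to_numpy() hands the data to NumPy.
n_profitable = int((df["profit"].to_numpy() > 0).sum())

print(df)
print("mean profit:", mean_profit)
print("profitable rows:", n_profitable)
```

The same pattern of column-wise arithmetic, masking, and aggregation carries over to a real dataset once it has been loaded into a DataFrame.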
Credit goes to Kaggle for the data. The licensing for the data and other descriptive information can be found on the Udacity webpage.