Introduction to Basic Operations in Data Science

In this task, you are given a dataset (Titanic Survivors), and your job is to analyse it using relevant functions in Pandas, NumPy and Matplotlib.

Tasks to Perform on this Dataset

Here are a few things I want you to do:

- Download the dataset and load it using Pandas.
- Remove all rows containing NULL values from the dataset.
- Remove the 'Name' and 'PassengerId' columns.
- Divide the dataset in an 80:20 ratio using .loc and .iloc ONLY.
- Plot a histogram for each of the 'Fare' and 'Age' columns.
- Plot a bar chart for every binary column (like 'Survived').
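The tasks above can be sketched roughly as follows. This is one possible approach, not a reference solution: the function names (`clean`, `split_80_20`, `binary_columns`, `plot_overview`) are made up for illustration, the file name `train.csv` is an assumption about where you saved the Kaggle download, and "binary column" is interpreted here as any column with exactly two distinct values.

```python
import pandas as pd
import matplotlib.pyplot as plt


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows with any NULL value, then drop the identifier columns."""
    df = df.dropna()
    # errors="ignore" keeps this safe if a column is already absent.
    return df.drop(columns=["Name", "PassengerId"], errors="ignore")


def split_80_20(df: pd.DataFrame):
    """Split into the first 80% and last 20% of rows using only .iloc and .loc."""
    cut = len(df) * 8 // 10
    first = df.iloc[:cut]            # position-based selection
    last = df.loc[df.index[cut:]]    # label-based selection of the remainder
    return first, last


def binary_columns(df: pd.DataFrame) -> list:
    """Columns with exactly two distinct values (an assumed definition of 'binary')."""
    return [col for col in df.columns if df[col].nunique() == 2]


def plot_overview(df: pd.DataFrame):
    """Histograms for 'Fare' and 'Age', bar charts for every binary column.

    Assumes at least one of those columns is present.
    """
    numeric = [col for col in ("Fare", "Age") if col in df.columns]
    binary = binary_columns(df)
    fig, axes = plt.subplots(1, len(numeric) + len(binary),
                             figsize=(4 * (len(numeric) + len(binary)), 3),
                             squeeze=False)
    axes = axes[0]
    for ax, col in zip(axes, numeric):
        df[col].plot.hist(ax=ax, bins=20, title=col)
    for ax, col in zip(axes[len(numeric):], binary):
        df[col].value_counts().plot.bar(ax=ax, title=col)
    fig.tight_layout()
    return fig


# Example usage, assuming the Kaggle file was saved as train.csv:
#   df = clean(pd.read_csv("train.csv"))
#   first, last = split_80_20(df)
#   plot_overview(df)
#   plt.show()
```

Note that in the Kaggle `train.csv` the 'Cabin' column is empty for most passengers, so dropping every row with a NULL value leaves far fewer rows than you started with. The split above is not shuffled; it simply takes the first 80% of rows and the remaining 20%.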

Explore more commands and features on your own; hopefully this will give you a good start!

References

- Dataset: https://www.kaggle.com/c/titanic/
- Pandas and NumPy cheat sheet: http://www.cheat-sheets.org/saved-copy/NumPy_SciPy_Pandas_Quandl_Cheat_Sheet.pdf
- Pandas Tutorial: https://www.datacamp.com/community/tutorials/pandas-tutorial-dataframe-python
- Data Analysis with Python: https://www.youtube.com/watch?v=r-uOLxNrNk8