Titanic-ML-Competition

Titanic Competition is one the most popular data science challenge on Kaggle platform:

https://www.kaggle.com/c/titanic The goal of the challenge is simple: predict which people survived the Titanic disaster.

Of course, the Titanic dataset is an open dataset which you can reach on many repositories and websites. However, I used dataset from Kaggle. Also, Kaggle easily measure the accuracy of your predictions.

In this competition, you obtain two similar datasets: train.csv and test.csv.

train.csv contains the details of a subset of the 891 passengers on board and importantly, will reveal whether they survived or not.
test.csv dataset contains similar data for another 418 passengers but does not have survival information - it’s your job to predict these outcomes.

The training set is used to build a machine learning model, while test dataset is used to see how high is the accuracy of created model. Kaggle also includes gender_submission.csv, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like.

Variables in dataset with definition:

PassengerId - unique ID
Survived - Survival -> 0 = No, 1 = Yes
Pclass - Ticket Class -> 1 = 1st, 2 = 2nd, 3 = 3rd
Name - Name and Title of a passenger
Sex - Sex of a Passenger
Age - Age of a Passenger
SibSp - Number of Passenger's Siblings/Spouses aboard
Parch - Number of Passenger's Parents/Children aboard
Ticket - Tictek number
Fare - Passenger fare
Cabin - Cabin number
Embarked - Port of Embarkation -> C = Cherbourg, Q = Queenstown, S = Southampton

Workflow of a project

My work is divided into 3 parts:

Part 1 - Data Cleaning
Part 2 - EDA
Part 3 - Decision Tree Model

R (RStudio)

R version 4.1.1 (2021-08-10)

R packages:

ggplot2
dplyr
gridExtra
rpart
rpart.plot

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
plots		plots
PredSubmit.csv		PredSubmit.csv
README.md		README.md
Titanic_Part1_DataCleaning.R		Titanic_Part1_DataCleaning.R
Titanic_Part2_EDA.R		Titanic_Part2_EDA.R
Titanic_Part3_DecisionTree.R		Titanic_Part3_DecisionTree.R
gender_submission.csv		gender_submission.csv
merged.csv		merged.csv
merged_fam.csv		merged_fam.csv
test.csv		test.csv
testPlot.png		testPlot.png
train.csv		train.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Titanic-ML-Competition

Workflow of a project

R (RStudio)

About

Releases

Packages

Languages

hhnks/Titanic-ML-Competition

Folders and files

Latest commit

History

Repository files navigation

Titanic-ML-Competition

Workflow of a project

R (RStudio)

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages