aanchan/Titanic-MTLDATA

Titanic-MTLDATA

The goal of this IPython Notebook is to introduce some tools in Python for data processing and machine learning. It is by no means exhaustive in its coverage of Python, machine learning, data processing in Python, or any combination of these. The classifier used in this Notebook is logistic regression, one of the simplest machine learning classifiers. The data set on which the examples are run is taken from the Kaggle Titanic Challenge. This example was chosen specifically because there are many tutorials, IPython Notebooks, articles, blogs, and other resources online on the Titanic Challenge that would help one get started.

To view this notebook online, click on this link. The notebook assumes some facility in working with Python.

Installing tools required for this Notebook to run on your system.

1. Python
2. IPython (optional, since you could run the Python commands from the IPython notebook in your native Python interpreter)
3. Numpy
4. Scipy
5. Pandas
6. Scikit-Learn

Installation methods vary depending on the operating system. Here is a great link on completing a Python setup for scientific purposes.
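Once the tools are installed, a quick check can confirm that Python is able to find them. The following is a minimal sketch that uses only the standard library; the helper name `missing_packages` is hypothetical, and the import names are assumptions based on the usual conventions (Scikit-Learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the tools listed above.
# Assumption: Scikit-Learn is installed under its usual import name "sklearn".
REQUIRED = ["numpy", "scipy", "pandas", "sklearn"]
OPTIONAL = ["IPython"]


def missing_packages(names):
    """Return the names from `names` that Python cannot locate for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report what is missing without actually importing anything.
for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
    missing = missing_packages(names)
    if missing:
        print("Missing " + label + " packages: " + ", ".join(missing))
    else:
        print("All " + label + " packages found.")
```

If a required package is reported as missing, install it with whichever method suits your operating system before opening the notebook.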

Below are pointers to some resources that might help one get started.

The Kaggle Titanic Challenge

Read about it here

Introduction to Logistic Regression

A Simple Explanation from Duke Medicine

Logistic Regression for Classification
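As a rough illustration of the idea behind these resources, logistic regression passes a weighted sum of the input features through the logistic (sigmoid) function to get a probability, then thresholds that probability to pick a class. The sketch below uses only the standard library; the function names, weights, and the 0.5 threshold are illustrative assumptions and are not taken from the notebook.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the interval [0, 1]."""
    # Two branches keep math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, features):
    """Probability of the positive class for one example."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict(weights, bias, features, threshold=0.5):
    """Class label (0 or 1) obtained by thresholding the probability."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0


# Hypothetical weights for a two-feature model, chosen only for illustration.
example_weights = [1.0, -1.0]
example_bias = 0.0
print(predict_proba(example_weights, example_bias, [2.0, 1.0]))
print(predict(example_weights, example_bias, [2.0, 1.0]))
```

Training a model amounts to choosing the weights and bias so that these probabilities match the observed labels as closely as possible, which is the part a library such as Scikit-Learn handles.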

Introduction to Python

A course from Coursera. It does not require one to download and install Python; the course uses a version that runs interactively in the browser.

The best intro, I think, is the one from the Python docs.

Introduction to the Numpy module in Python

The Tentative Numpy Tutorial is a good place to start.

Introduction to Scikit-Learn (referred to in this document also as SKLearn)

Here is a great introduction to Machine Learning with Scikit-Learn. It's a tutorial from PyCon 2014.
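For a first taste of the Scikit-Learn workflow, the sketch below fits a `LogisticRegression` model on a tiny synthetic data set and then predicts labels for two new points. The data are made up purely for illustration and have nothing to do with the Titanic data; the variable names are assumptions as well.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic one-feature data: small values belong to class 0, large values to class 1.
X = np.array([[0.0], [1.0], [2.0], [3.0], [7.0], [8.0], [9.0], [10.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Fit the classifier with its default settings.
model = LogisticRegression()
model.fit(X, y)

# Predict class labels and class probabilities for two unseen points.
new_points = np.array([[1.0], [9.0]])
predictions = model.predict(new_points)
probabilities = model.predict_proba(new_points)

print(predictions)
print(probabilities)
```

Most Scikit-Learn estimators follow the same pattern of constructing the estimator, calling `fit`, and then calling `predict`, so swapping in a different classifier usually changes only the constructor line.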

Introduction to Pandas

The Python Pandas Cookbook lecture series on YouTube by Alfred Essa is a good place to start. Specifically, Alfred Essa covers loading the Titanic data set in Lesson 1.2.
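To give a flavour of what loading and preparing the data looks like in Pandas, here is a minimal sketch. It reads a few made-up rows from an in-memory string so that it runs on its own; the rows are not real passenger records, and only the column names follow Kaggle's Titanic training file. With the real data you would pass the path of the downloaded CSV file to `pd.read_csv` instead (the file name in the comment is an assumption). The `IsFemale` column is a hypothetical name chosen for this example.

```python
import io

import pandas as pd

# Made-up rows for illustration only, NOT real passenger records.
SAMPLE_CSV = """PassengerId,Survived,Pclass,Sex,Age,Fare
1,0,3,male,30.0,8.05
2,1,1,female,40.0,80.0
3,1,2,female,,20.5
4,0,3,male,25.0,7.75
"""

# With the real data set this would be something like pd.read_csv("train.csv").
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Fill the missing age with the median of the known ages.
df["Age"] = df["Age"].fillna(df["Age"].median())

# Encode the text column as a number, since classifiers expect numeric input.
df["IsFemale"] = (df["Sex"] == "female").astype(int)

# Average of the Survived column per group.
survival_by_sex = df.groupby("Sex")["Survived"].mean()

print(df)
print(survival_by_sex)
```

Filling gaps and converting text columns to numbers are typical preparation steps before handing a table to a classifier.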

Other useful resources

A tutorial from Kaggle on Python

A tutorial from Kaggle on Pandas

A tutorial from Kaggle on SKLearn

An SKLearn Notebook

A Fancy Notebook showing off many aspects of the Titanic data problem
