machine_learning_notes

This repository is part of the Machine Learning Study Group from the recworks meet-a-mentor community, where we work on a Kaggle ML project.

We use Jupyter Notebook, Python and some libraries (Pandas, NumPy, Matplotlib, and Scikit-learn) to solve ML problems on public datasets. We have started with the following dataset from Kaggle:

https://www.kaggle.com/c/house-prices-advanced-regression-techniques

This project is the joint effort of a study group of enthusiastic people from different professional backgrounds. Anyone is welcome to join and contribute to the discussions.

The best way to join the project is by attending our regular sessions. Also, feel free to fork the repository and check what we have done so far. We are very happy to receive any comments and suggestions for improvement.

There are two notebooks contained in the repository:

1) machine_learning_notes.ipynb:

This notebook contains:

1 - Define the problem
2 - Load data and display info
3 - Prepare Data
- [Identify features]
  - Separate numerical from categorical features
  - Separate nominal and ordinal (from categorical features)
- [Clean data]
  - Remove numerical features with missing values
  - Remove categorical features with missing values
  - Drop outliers in numerical values # WIP
- [Transform]
  - Transform categorical values # TODO
4 - Feature selection
- [Select features using random forest classifier]
5 - Spot Check Algorithms
- [Split dataset]
- [Train on multiple algorithms]
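
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it runs on a small synthetic DataFrame instead of the Kaggle files, all column names are hypothetical, and the three spot-check models are an arbitrary choice. Because the target in this sketch is a continuous price, it uses `RandomForestRegressor` rather than the random forest classifier named in the outline.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the Kaggle data (every column name here is hypothetical).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "area": rng.normal(150, 30, n),
    "rooms": rng.integers(1, 8, n).astype(float),
    "noise": rng.normal(0, 1, n),
    "garage": rng.normal(20, 5, n),
    "zone": rng.choice(["A", "B", "C"], n),
})
df["price"] = 1000 * df["area"] + 5000 * df["rooms"] + rng.normal(0, 2000, n)
df.loc[::10, "garage"] = np.nan  # inject missing values into one column

# Identify features: separate numerical from categorical columns.
features = df.drop(columns="price")
numerical = features.select_dtypes(include="number").columns.tolist()
categorical = features.select_dtypes(exclude="number").columns.tolist()

# Clean data: remove numerical features that contain missing values.
numerical_clean = [c for c in numerical if not features[c].isna().any()]
X = features[numerical_clean]
y = df["price"]

# Feature selection: rank features by random forest importance, keep the top two.
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
importances = pd.Series(forest.feature_importances_, index=numerical_clean)
selected = importances.sort_values(ascending=False).index[:2].tolist()

# Spot check: split the dataset and train several algorithms on the same split.
X_train, X_test, y_train, y_test = train_test_split(
    X[selected], y, test_size=0.25, random_state=0
)
models = {
    "LinearRegression": LinearRegression(),
    "DecisionTree": DecisionTreeRegressor(random_state=0),
    "KNeighbors": KNeighborsRegressor(),
}
# score() returns R^2 on the held-out data for scikit-learn regressors.
scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}
print(pd.Series(scores).sort_values(ascending=False))
```

Categorical columns are only identified here, not used for training, which mirrors the TODO status of the transform step in the outline.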

2) plot_outlier_detection.ipynb

This notebook is a reference for some techniques for detecting and removing outlier samples from the dataset.
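
As a rough illustration of what detecting and removing outlier samples can look like, the sketch below shows two common approaches: the interquartile-range (IQR) rule on a single column and scikit-learn's `IsolationForest` on all columns. These are assumptions chosen for illustration and may not be the techniques the notebook covers; the data is synthetic, and the column names and the `contamination` value are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Synthetic data with one planted outlier in row 0 (hypothetical columns).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "area": rng.normal(150, 10, 100),
    "price": rng.normal(200_000, 10_000, 100),
})
df.loc[0, ["area", "price"]] = [1000.0, 50_000.0]

# IQR rule: keep rows whose "area" lies within 1.5 * IQR of the quartiles.
q1, q3 = df["area"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = df["area"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
df_iqr = df[in_range]

# Isolation forest: fit_predict labels inliers as 1 and outliers as -1.
labels = IsolationForest(contamination=0.05, random_state=0).fit_predict(df)
df_iso = df[labels == 1]

print(len(df), len(df_iqr), len(df_iso))
```

The IQR rule works column by column and is easy to explain, while the isolation forest considers all columns together at the cost of a tuning parameter.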
