machine learning projects

This repo is just me practicing data cleanup and fitting different types of data with various ML models.

Steps taken: create an Anaconda environment for my machine learning adventures:

conda create -n ml_projects python=2.7

Install required packages:

pip install <package_name>

I am playing around with two different text editors:

Sublime Text: https://www.sublimetext.com/

Atom: https://atom.io/

Atom has some cool features and GitHub integrations. Highly recommended (especially look into the Hydrogen feature). Here is an interesting Atom + Markdown + pandoc + Jupyter notebook workflow:

https://www.reddit.com/r/Python/comments/8mqd40/convenient_and_easily_tweakable/?st=JHSQ9XF4&sh=d370eefa

Next, I type up my code and convert it to a Jupyter notebook for a more interactive display. You can also start directly with a notebook, but I prefer this conversion method:

git clone https://github.com/sklam/py2nb.git

Installation:

cd py2nb

python setup.py install

Usage:

python -m py2nb input.py output.ipynb

Basic ml

Check out the notebook basic_ml.ipynb. It gives you a basic framework for splitting, visualizing, and predicting. You can reuse this framework with different datasets to test which algorithm works best for your data.
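The notebook itself is not reproduced here. As a rough illustration of the split / fit / predict / compare loop described above, here is a sketch that uses only the standard library. The helper names (`split`, `majority_baseline`, `nearest_neighbor`, `accuracy`, `compare_models`) and the two toy models are hypothetical stand-ins chosen for illustration, not the notebook's code, and plotting is left out. In practice a library such as scikit-learn would supply these pieces.

```python
import random


def split(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle indices reproducibly, then split into train and test sets."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(rows) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [rows[i] for i in train_idx]
    x_test = [rows[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


def majority_baseline(train_rows, train_labels):
    """Toy model 1: always predict the most common training label."""
    most_common = max(set(train_labels), key=train_labels.count)

    def predict(row):
        return most_common
    return predict


def nearest_neighbor(train_rows, train_labels):
    """Toy model 2: predict the label of the closest training row."""
    def predict(row):
        distances = [sum((a - b) ** 2 for a, b in zip(row, other))
                     for other in train_rows]
        return train_labels[distances.index(min(distances))]
    return predict


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(1 for row, label in zip(rows, labels)
                  if predict(row) == label)
    return float(correct) / len(labels)


def compare_models(rows, labels, trainers):
    """Fit every trainer on the same split; return test accuracy by name."""
    x_train, x_test, y_train, y_test = split(rows, labels)
    return {name: accuracy(fit(x_train, y_train), x_test, y_test)
            for name, fit in trainers.items()}


# Made-up data: two well-separated clusters of 2-D points.
rows = [(i, i) for i in range(10)] + [(i + 100, i + 100) for i in range(10)]
labels = ["low"] * 10 + ["high"] * 10
scores = compare_models(rows, labels, {
    "majority_baseline": majority_baseline,
    "nearest_neighbor": nearest_neighbor,
})
print(scores)
```

Every model is scored on the same held-out split, so the numbers are directly comparable. Swapping in a new dataset or another trainer only changes the arguments to `compare_models`.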

Anomaly detection

This technique works well when you know most of your dataset is negative (i.e. does not meet the anomaly criteria) and only a small subset is positive (i.e. meets the anomaly criteria).

In the code I use a simple moving average to capture the trend, then mark anything more than 3 standard deviations away from the moving average as an anomaly.
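A minimal sketch of that idea, using only the standard library. It is not a copy of anomaly_detection.py: the function names, the trailing window of 5 points, and the choice to measure the standard deviation over the deviations from the moving average are all assumptions made for illustration.

```python
import math


def moving_average(values, window):
    """Trailing simple moving average; positions before a full window are None."""
    averages = [None] * len(values)
    for i in range(window - 1, len(values)):
        chunk = values[i - window + 1:i + 1]
        averages[i] = float(sum(chunk)) / window
    return averages


def find_anomalies(values, window=5, n_sigmas=3.0):
    """Return indices whose deviation from the moving average exceeds
    n_sigmas sample standard deviations of all such deviations."""
    averages = moving_average(values, window)
    deviations = [(i, values[i] - avg)
                  for i, avg in enumerate(averages) if avg is not None]
    if len(deviations) < 2:
        return []  # not enough points to estimate a spread
    mean_dev = sum(d for _, d in deviations) / len(deviations)
    variance = (sum((d - mean_dev) ** 2 for _, d in deviations)
                / (len(deviations) - 1))
    threshold = n_sigmas * math.sqrt(variance)
    return [i for i, d in deviations if abs(d) > threshold]


# Made-up series: a small repeating pattern with one injected spike.
series = [10 + (i % 3) for i in range(60)]
series[40] = 100
print(find_anomalies(series))  # prints [40]
```

A large spike also inflates the moving average and the standard deviation around it, so a shorter window or a lower `n_sigmas` makes the detector more sensitive at the cost of more false alarms.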

Very useful with time series data. I visualized the results using a simple plot function and a lag_plot from the pandas library. The lag plot does a good job of detecting outliers as well.
