You can download the dataset and its documentation from Kaggle.
python -m venv dev
source dev/Scripts/activate  # Windows (Git Bash); on Linux/macOS use: source dev/bin/activate
pip install -r requirements.txt
docker build --tag app:1.0 .
- Download the raw data;
- Execute src/clean.py to create cleaned_data.csv;
- Execute src/prepare_features.py to create training.pkl;
- Execute src/create_folds.py to create training_folds.pkl;
- Execute src/tune_hyper_parameters.py to get optimal parameters;
- Execute src/best.py to train the model.
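The steps above can be chained in a small driver script. The following is only a sketch: it assumes each script runs without command-line arguments and is launched from the repository root, and the names `PIPELINE_STEPS`, `build_commands`, `run_pipeline`, and the `--execute` flag are hypothetical, not part of the project.

```python
"""Run the pipeline scripts in the order listed above (a sketch)."""
import subprocess
import sys

# Script paths and their order are taken from the list above.
# Assumption: none of these scripts needs command-line arguments.
PIPELINE_STEPS = [
    "src/clean.py",
    "src/prepare_features.py",
    "src/create_folds.py",
    "src/tune_hyper_parameters.py",
    "src/best.py",
]


def build_commands(python=sys.executable):
    """Return one command (as an argument list) per pipeline step."""
    return [[python, script] for script in PIPELINE_STEPS]


def run_pipeline(execute=False):
    """Print each command; run it only when execute is True."""
    for command in build_commands():
        print(" ".join(command))
        if execute:
            # check=True raises CalledProcessError at the first failing step,
            # so later steps never run on missing or stale inputs.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical flag: without --execute this is a dry run that only prints.
    run_pipeline(execute="--execute" in sys.argv)
```

Running the script with no flag prints the five commands without executing them, which is a cheap way to check the order before starting a long training run.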
python src/report.py --fold=1
The fold value must be in the range [0, 4].
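To produce a report for every fold in one go, the command above can be repeated for each value in the range. This is a minimal sketch, assuming five folds numbered 0 to 4 as stated and that it is launched from the repository root; `report_commands` and `run_reports` are hypothetical helper names.

```python
"""Build (and optionally run) the report command for every fold (a sketch)."""
import subprocess
import sys

N_FOLDS = 5  # folds 0 through 4, matching the stated range [0, 4]


def report_commands(python=sys.executable, n_folds=N_FOLDS):
    """Return the report command for each fold as an argument list."""
    return [
        [python, "src/report.py", f"--fold={fold}"]
        for fold in range(n_folds)
    ]


def run_reports(execute=False):
    """Print each command; run it only when execute is True."""
    for command in report_commands():
        print(" ".join(command))
        if execute:
            # check=True stops at the first fold whose report fails.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical flag: without --execute this is a dry run that only prints.
    run_reports(execute="--execute" in sys.argv)
```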
python -m isort src/
python -m black src/
python -m flake8 src/ --count --statistics
This project is provided under the MIT license.