Template and utility code for a machine learning project
- src/ all source code for this project
- tests/ unit, integration, and smoke tests
- notebooks/ Jupyter notebooks
- scripts/ Python or shell scripts
- model_finished/ finished models ready for deployment
- model_training/ unfinished models (still in training)
- logs/ log files
- data/ raw_data, debug_data
conda env create -f environment.yaml
conda activate revenue_model
# save the conda environment settings
conda env export --no-builds > environment.yaml
# create a Jupyter kernel
python -m ipykernel install --user --name ml_project --display-name "Python3.8(ml_project)"
- Config for training is located in scripts/train_config.py
- Config for source code is located in src/config.py
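
The contents of the two config files are not shown here. As a rough illustration only, a training config module could look something like the sketch below. `TrainConfig` and every field name and default value are hypothetical; only the directory names are taken from the layout above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical training settings; these names are not from the project."""

    data_dir: Path = Path("data")             # raw_data, debug_data
    model_dir: Path = Path("model_training")  # unfinished models
    log_dir: Path = Path("logs")              # log files
    test_size: float = 0.2                    # assumed hold-out fraction
    random_seed: int = 42                     # assumed seed for reproducibility


config = TrainConfig()
# Build a path for a model checkpoint (the file name is made up).
checkpoint_path = config.model_dir / "model_v1.pkl"
print(checkpoint_path)
```

Freezing the dataclass keeps settings from being changed by accident at runtime, which is one common reason to hold configuration in a single object.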
# set PYTHONPATH first, or the following commands will not work
export PYTHONPATH=./:$PYTHONPATH
# split the data into train and test sets, then save them
python scripts/data_cvt.py
# train, save, and evaluate the model
python scripts/model_train.py
# load model & eval model; to specify the model version, refer
python scripts/model_eval.py
# run the tests; `-s` disables output capturing, so print output is shown
pytest -s tests
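
What `scripts/data_cvt.py` does internally is not shown here. As a minimal sketch of the train/test split step mentioned above, using only the standard library, something like the following could work. The function name, the default `test_size`, and the seed are assumptions, not the project's actual code.

```python
import random


def train_test_split(rows, test_size=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be between 0 and 1")
    shuffled = list(rows)
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


train, test = train_test_split(range(10), test_size=0.2)
print(len(train), len(test))  # prints: 8 2
```

Fixing the seed means repeated runs produce the same split, which makes evaluation results comparable across training runs.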
MLflow
mlflow run . -e cla_data_cvt
- conda environment setup test
- re-organize src/utils
- LightGBM support
- deep learning support
- code refactor