- Status
- Context
- Techniques
- Tooling
- WebAPI
- Notebook
- Heroku
- Components Diagram
- Architecture Basic Diagram
- Architecture Detail Diagram
The WebAPI that consumes the model is available at predia.herokuapp.com.
This repository contains the Machine Learning model and the WebAPI of PREDIA – Modelo Híbrido Multifatorial, my final paper at Unisinos. The work starts by applying the OneHotEncoding technique to the dataset. Next, Exploratory Data Analysis and, more specifically, Correlation Analysis were used to find the features that were decreasing the model's performance. Model building then starts with the selection of 3 heterogeneous algorithms, each of which makes its own prediction through a pipeline composed of Feature Engineering + Permutation Importance + Randomized Search and Feature Scaling (with MinMaxScaler). Once the pipeline has finished, the Ensemble Learning technique called Aggregation combines the individual predictions into a final forecast of the number of sales for the next day. The final model has an RMSE of 17.42, which represents 14% of the sales mean.
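To make the last figure concrete, the sketch below shows one way to compute an RMSE and express it as a fraction of the mean of the actual values. The function names `rmse` and `relative_rmse` and the sample numbers are illustrative assumptions, not code or data from this repository. Read this way, an RMSE of 17.42 at 14% would correspond to a sales mean of roughly 120 to 130.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally sized sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_rmse(actual, predicted):
    """RMSE divided by the mean of the actual values (hypothetical helper)."""
    mean_actual = sum(actual) / len(actual)
    return rmse(actual, predicted) / mean_actual


# Made-up daily sales, for illustration only.
actual_sales = [100, 120, 140]
predicted_sales = [110, 115, 150]

error = rmse(actual_sales, predicted_sales)
ratio = relative_rmse(actual_sales, predicted_sales)
print(f"RMSE: {error:.2f} ({ratio:.1%} of the sales mean)")
```

Dividing by the mean makes the error comparable across datasets with different sales volumes, which is presumably why the result is reported both ways.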
- OneHotEncoding: to transform the categorical features into binary columns so the algorithms can make better predictions.
- Exploratory Data Analysis: to check how my data is structured.
- Correlational Feature Analysis: to clean the dataset by removing the features that carry no meaning.
- Cross Validation: to use the whole dataset instead of only one period.
- Permutation Importance: to identify the most important features for each algorithm.
- MinMaxScaler: to scale the dataset from 0 to 1 so that differences in feature scale do not distort the algorithms.
- Randomized Search: to identify the best hyperparameters for each algorithm.
- Ensemble Learning: to aggregate the results of each model into one and increase the performance.
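The techniques above can be chained roughly as in the following sketch. It is a minimal illustration under stated assumptions, not the code from the notebook: the data is synthetic, the three regressors (`Ridge`, `KNeighborsRegressor`, `RandomForestRegressor`) and their search spaces are stand-ins for the 3 algorithms actually used, and the aggregation is assumed to be a plain average of the individual predictions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the encoded sales dataset (assumption).
rng = np.random.default_rng(0)
X = rng.random((300, 6))
y = 100 + 40 * X[:, 0] - 25 * X[:, 1] + rng.normal(0, 5, 300)
X_train, X_test = X[:240], X[240:]
y_train, y_test = y[:240], y[240:]

# Three heterogeneous regressors with illustrative search spaces (assumptions).
candidates = {
    "ridge": (Ridge(), {"model__alpha": [0.01, 0.1, 1.0, 10.0]}),
    "knn": (KNeighborsRegressor(), {"model__n_neighbors": [3, 5, 9, 15]}),
    "forest": (
        RandomForestRegressor(random_state=0),
        {"model__n_estimators": [50, 100], "model__max_depth": [3, 6, None]},
    ),
}

predictions = {}
top_features = {}
for name, (estimator, space) in candidates.items():
    # Scaling lives inside the pipeline so it is refit on every CV fold.
    pipe = Pipeline([("scale", MinMaxScaler()), ("model", estimator)])
    search = RandomizedSearchCV(
        pipe,
        space,
        n_iter=4,
        cv=3,
        scoring="neg_root_mean_squared_error",
        random_state=0,
    )
    search.fit(X_train, y_train)

    # Rank features by how much shuffling each one degrades the score.
    importance = permutation_importance(
        search.best_estimator_, X_test, y_test, n_repeats=5, random_state=0
    )
    top_features[name] = int(np.argmax(importance.importances_mean))
    predictions[name] = search.predict(X_test)

# Aggregation step: here assumed to be a simple average of the three models.
ensemble = np.mean(list(predictions.values()), axis=0)
ensemble_rmse = float(np.sqrt(mean_squared_error(y_test, ensemble)))
print(f"Ensemble RMSE: {ensemble_rmse:.2f}")
```

Keeping `MinMaxScaler` inside the `Pipeline` avoids leaking statistics from the validation folds into training, and averaging is only one possible aggregation; a weighted average or a stacked model would fit the same structure.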
- Python: as the main language.
- Jupyter Notebook: as the IDE to develop the model.
- scikit-learn (SKLearn): as the ML library.
- Keras: as the deep learning library.
- Streamlit: as the framework to build the WebAPI.
- Heroku: as the server to host the entire model and the WebAPI.
pipenv shell
pip install streamlit
pip install plotly
pip install scikit-learn
pip install keras
pip install tensorflow
streamlit run app.py
cd predia
jupyter notebook
heroku login
...
git push heroku master
heroku logs --tail