LuisValgoi/predia

Summary

Status

Heroku

You can access the WebAPI that consumes the model at predia.herokuapp.com.

Context

This repository contains the Machine Learning model and the WebAPI of PREDIA – Modelo Híbrido Multifatorial, my final paper at Unisinos. The work starts by applying the OneHotEncoding technique to the dataset. After that, Exploratory Data Analysis and, more specifically, Correlation Analysis were performed to find the features that were decreasing the model's performance. Model building then starts with the selection of 3 heterogeneous algorithms, each of which makes a prediction following a pipeline composed of Feature Engineering + Permutation Importance + Randomized Search + Feature Scaling (with MinMaxScaler). Once the pipeline is finished, the Ensemble Learning technique called Aggregation combines the results, generating a final number of units expected to be sold on the next day. The final model has an RMSE of 17.42, which represents 14% of the sales mean.
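To illustrate the aggregation step and how RMSE relates to the sales mean, here is a minimal sketch. It is not the repository's code: the predictions and actuals are made-up numbers, and the real pipeline averages three trained heterogeneous models instead.

```python
import numpy as np

# Hypothetical next-day sales predictions from three heterogeneous models
pred_a = np.array([110.0, 95.0, 130.0, 120.0, 100.0])
pred_b = np.array([105.0, 100.0, 125.0, 118.0, 104.0])
pred_c = np.array([112.0, 97.0, 128.0, 121.0, 99.0])

# Aggregation: average the individual predictions into one final forecast
final = (pred_a + pred_b + pred_c) / 3

# Made-up actual sales for the same days
actual = np.array([108.0, 98.0, 127.0, 119.0, 102.0])

# RMSE, and RMSE expressed as a percentage of the sales mean
# (the paper reports RMSE = 17.42, i.e. 14% of the sales mean)
rmse = np.sqrt(np.mean((final - actual) ** 2))
rmse_pct = rmse / actual.mean() * 100
print(round(rmse, 4), round(rmse_pct, 2))
```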

Techniques

  • OneHotEncoding: to optimize the algorithms' predictions by encoding categorical features as binary columns.
  • Exploratory Data Analysis: to understand how the data is structured.
  • Correlation Feature Analysis: to remove the features that carry no predictive value.
  • Cross Validation: to evaluate on the whole dataset instead of a single hold-out period.
  • Permutation Importance: to identify the most important features for each algorithm.
  • MinMaxScaler: to scale the features to the [0, 1] range so the algorithms are not affected by differences in magnitude.
  • Randomized Search: to identify the best hyperparameters for each algorithm.
  • Ensemble Learning: to aggregate the results of each model into one and increase performance.
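Several of these techniques fit naturally into a single scikit-learn pipeline. The sketch below is not the repository's actual code; it is a minimal illustration on synthetic data, with Ridge standing in for one of the three algorithms, combining MinMaxScaler, Randomized Search with Cross Validation, and Permutation Importance:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.inspection import permutation_importance
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the sales dataset
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Feature Scaling + one algorithm (Ridge used here as a placeholder)
pipe = Pipeline([("scale", MinMaxScaler()), ("model", Ridge())])

# Randomized Search over hyperparameters, evaluated with 5-fold Cross Validation
search = RandomizedSearchCV(
    pipe,
    param_distributions={"model__alpha": np.logspace(-3, 2, 20)},
    n_iter=5,
    cv=5,
    random_state=0,
)
search.fit(X, y)

# Permutation Importance: rank features by how much shuffling each one hurts the score
result = permutation_importance(search.best_estimator_, X, y, n_repeats=5, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]
print("best alpha:", search.best_params_["model__alpha"])
print("feature ranking:", ranking)
```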

Tooling

  • Python: as the main language.
  • Jupyter Notebook: as the IDE to develop the model.
  • SKLearn: as the ML library.
  • Keras: as the deep learning library.
  • Streamlit: as the framework used to build the WebAPI.
  • Heroku: as the server hosting the model and the WebAPI.

WebAPI - Getting Started

pipenv shell
pip install streamlit
pip install plotly
pip install scikit-learn
pip install keras
pip install tensorflow
streamlit run app.py

Jupyter Notebook - Getting Started

cd predia
jupyter notebook

Heroku - Getting Started

heroku login
...
git push heroku master
heroku logs --tail

Components Diagram

03_Components

Architecture Basic Diagram

03_Steps

Architecture Detail Diagram

04_Architecture_Detail

Sales History


Algorithms Prediction Combined

