LuisValgoi/predia

Summary

Status

Heroku

You can access the WebAPI that consumes the model at predia.herokuapp.com.

Context

This repository contains the Machine Learning model and the WebAPI of PREDIA – Modelo Híbrido Multifatorial, my final paper at Unisinos. The work starts by applying the OneHotEncoding technique to the dataset. After that, Exploratory Data Analysis and, more specifically, Correlation Analysis were performed to find the features that were decreasing the model's performance. Model building then starts with the selection of 3 heterogeneous algorithms, each of which makes a prediction following a pipeline composed of Feature Engineering + Permutation Importance + Randomized Search + Feature Scaling (with MinMaxScaler). Once the pipeline is finished, the Ensemble Learning technique called Aggregation combines the results, generating a final number of units expected to be sold on the next day. The final model has an RMSE of 17.42, which represents 14% of the sales mean.
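To illustrate the aggregation step and how RMSE relates to the sales mean, here is a minimal sketch. It is not the repository's code: the predictions and actuals are made-up numbers, and the real pipeline averages three trained heterogeneous models instead.

```python
import numpy as np

# Hypothetical next-day sales predictions from three heterogeneous models
pred_a = np.array([110.0, 95.0, 130.0, 120.0, 100.0])
pred_b = np.array([105.0, 100.0, 125.0, 118.0, 104.0])
pred_c = np.array([112.0, 97.0, 128.0, 121.0, 99.0])

# Aggregation: average the individual predictions into one final forecast
final = (pred_a + pred_b + pred_c) / 3

# Made-up actual sales for the same days
actual = np.array([108.0, 98.0, 127.0, 119.0, 102.0])

# RMSE, and RMSE expressed as a percentage of the sales mean
# (the paper reports RMSE = 17.42, i.e. 14% of the sales mean)
rmse = np.sqrt(np.mean((final - actual) ** 2))
rmse_pct = rmse / actual.mean() * 100
print(round(rmse, 4), round(rmse_pct, 2))
```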

Techniques

  • OneHotEncoding: to optimize the algorithms' predictions by encoding categorical features as binary columns.
  • Exploratory Data Analysis: to understand how the data is structured.
  • Correlation Feature Analysis: to remove the features that carry no predictive value.
  • Cross Validation: to evaluate on the whole dataset instead of a single hold-out period.
  • Permutation Importance: to identify the most important features for each algorithm.
  • MinMaxScaler: to scale the features to the [0, 1] range so the algorithms are not affected by differences in magnitude.
  • Randomized Search: to identify the best hyperparameters for each algorithm.
  • Ensemble Learning: to aggregate the results of each model into one and increase performance.
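Several of these techniques fit naturally into a single scikit-learn pipeline. The sketch below is not the repository's actual code; it is a minimal illustration on synthetic data, with Ridge standing in for one of the three algorithms, combining MinMaxScaler, Randomized Search with Cross Validation, and Permutation Importance:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.inspection import permutation_importance
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the sales dataset
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Feature Scaling + one algorithm (Ridge used here as a placeholder)
pipe = Pipeline([("scale", MinMaxScaler()), ("model", Ridge())])

# Randomized Search over hyperparameters, evaluated with 5-fold Cross Validation
search = RandomizedSearchCV(
    pipe,
    param_distributions={"model__alpha": np.logspace(-3, 2, 20)},
    n_iter=5,
    cv=5,
    random_state=0,
)
search.fit(X, y)

# Permutation Importance: rank features by how much shuffling each one hurts the score
result = permutation_importance(search.best_estimator_, X, y, n_repeats=5, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]
print("best alpha:", search.best_params_["model__alpha"])
print("feature ranking:", ranking)
```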

Tooling

  • Python: as the main language.
  • Jupyter Notebook: as the IDE to develop the model.
  • SKLearn: as the ML library.
  • Keras: as the deep learning library.
  • Streamlit: as the framework used to build the WebAPI.
  • Heroku: as the server hosting the model and the WebAPI.

WebAPI - Getting Started

pipenv shell
pip install streamlit
pip install plotly
pip install scikit-learn
pip install keras
pip install tensorflow
streamlit run app.py

Jupyter Notebook - Getting Started

cd predia
jupyter notebook

Heroku - Getting Started

heroku login
...
git push heroku master
heroku logs --tail

Components Diagram

03_Components

Architecture Basic Diagram

03_Steps

Architecture Detail Diagram

04_Architecture_Detail

Sales History


Algorithms Prediction Combined

