Check out Lesson 3 on Medium to better understand how we built the batch prediction pipeline.
Also, check out Lesson 5 to learn how we implemented the monitoring layer to compute the model's real-time performance.
The batch prediction pipeline uses the training pipeline module as a dependency. Thus, as a first step, we must ensure that the training pipeline module is published to our private PyPI server.

NOTE: Make sure that your private PyPI server is running. Check the Usage section if it isn't.
Build & publish the training-pipeline to your private PyPI server:

```shell
cd training-pipeline
poetry build
poetry publish -r my-pypi
cd ..
```
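Because the batch prediction pipeline installs training-pipeline from that private index, its Poetry configuration typically declares the index as a package source. A hypothetical sketch of the relevant `pyproject.toml` fragment — the URL is a placeholder for your server's actual address, and the version constraint is illustrative:

```toml
# Hypothetical example: point Poetry at the private index so it can
# resolve the training-pipeline dependency. Replace the URL with the
# address of your own PyPI server.
[[tool.poetry.source]]
name = "my-pypi"
url = "http://localhost:8080/simple/"

[tool.poetry.dependencies]
# Pull training-pipeline from the private source rather than public PyPI.
training-pipeline = { version = "*", source = "my-pypi" }
```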
Install the virtual environment for batch-prediction-pipeline:

```shell
cd batch-prediction-pipeline
poetry shell
poetry install
```
Check the Set Up Additional Tools and Usage sections to see how to set up the additional tools and credentials you need to run this project.
To start the batch prediction script, run:

```shell
python -m batch_prediction_pipeline.batch
```
To compute the monitoring metrics, run:

```shell
python -m batch_prediction_pipeline.monitoring
```
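To give a feel for what a monitoring step computes, here is a minimal sketch of one common real-time performance metric, MAPE (mean absolute percentage error), comparing predictions against observed values. This is an illustrative example only — the exact metrics computed by `batch_prediction_pipeline.monitoring` may differ:

```python
import numpy as np


def mape(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute percentage error, in percent.

    Assumes y_true contains no zeros (percentage error is undefined there).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)
```

For example, predictions of 110 and 180 against actuals of 100 and 200 are each off by 10%, so the MAPE is 10.0.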
NOTE: Be careful to complete the `.env` file and set the `ML_PIPELINE_ROOT_DIR` variable as explained in the Set Up the ML_PIPELINE_ROOT_DIR Variable section of the main README.
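As an illustration, one way to set the variable for the current shell session — the value below assumes you are running from the repository root, so adjust it to your setup (the main README's instructions take precedence):

```shell
# Illustrative only: export ML_PIPELINE_ROOT_DIR for this shell session.
# Here we point it at the current directory; use your actual project root.
export ML_PIPELINE_ROOT_DIR="$(pwd)"
echo "$ML_PIPELINE_ROOT_DIR"
```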