The Prediction API is built with Django REST Framework. It is used to
- store image and taxonomic data for model training
- act as a gateway to the prediction model (by preprocessing posted images and postprocessing the model's results)
- gather response metadata (e.g. example images)
- store data from prediction requests to be used for future analysis
- Architecture
- Endpoints
- Testing
- Admin
- Users
- Models
- Database Fixtures
- Prediction Model / Tensorflow Serving
- Running the project locally
- Deployment
Detailed documentation about the API endpoints can be found at https://biodex.ethz.ch/pred-api/docs/
TO FOLLOW
For convenience, you can import Lepi.postman_collection.json (found in /backend) into Postman (https://www.postman.com/) and test the endpoints there.
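Outside Postman, the endpoints can also be exercised from a short script. Here is a hedged example using Python's requests library; the route and token header follow common DRF conventions and are assumptions, so check the endpoint docs above for the real paths:

```python
import requests

API_URL = "https://biodex.ethz.ch/pred-api/predict/"  # assumed route; see the docs link above
TOKEN = "your-api-token"

# The API expects the image in a field named 'image' (see the prediction steps below)
with open("moth.jpg", "rb") as f:
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Token {TOKEN}"},  # DRF TokenAuthentication style
        files={"image": f},
    )
print(response.status_code, response.json())
```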
With a staff or superuser account, you can log in to https://biodex.ethz.ch/pred-api/lepi-admin/ to manage users, API tokens and other database records.
The site .../admin/ is the default location for the admin site in Django. As a security measure, we have renamed it as mentioned above.
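The rename itself is a one-line change in the URL configuration. A sketch of what this looks like in a Django urls.py (the project's actual file may differ):

```python
from django.contrib import admin
from django.urls import path

urlpatterns = [
    # Admin moved off Django's default 'admin/' path as a security measure
    path("lepi-admin/", admin.site.urls),
]
```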
Prediction API and website users can only be created via the Django admin panel, as there is no sign-up process. This is to control who is using the prediction API in their programs.
There is currently no way to register without knowing an admin.
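For completeness, a user and their API token can also be created from a Django shell. A sketch using Django's and DRF's standard APIs, assuming the API tokens mentioned above are DRF TokenAuthentication tokens:

```python
# Run inside the backend container: python manage.py shell
from django.contrib.auth import get_user_model
from rest_framework.authtoken.models import Token

User = get_user_model()
user = User.objects.create_user(username="new_api_user", password="change-me")
token = Token.objects.create(user=user)
print(token.key)  # hand this key to the API consumer
```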
The following records store what the model needs for predictions:
- Image normalization values: before an image is sent to the TensorFlow model for prediction, the RGB values in the image are normalized using the RGB mean and RGB standard deviation calculated from that model's training data.
- species_key_map: maps the species as numbered by the prediction model to the species PKs in the database. Model classes range from 0 to n-1, where n is the number of species the model was trained on. Class 0 in one model may not be the same as class 0 in a different model, so the per-model class IDs need to be mapped to permanent species IDs in the database.
- Encoded hierarchy maps: these map how classes at each hierarchical level correspond to the classes at their parent level. This is used when summing the probabilities from the species level up to the family level (see the sketch below).
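A minimal sketch of how these three records fit together, assuming simple dictionary shapes (the values and structures here are illustrative, not the project's actual fixtures):

```python
import numpy as np

# Illustrative per-model records, mirroring the descriptions above
rgb_mean = np.array([0.5, 0.5, 0.5])        # example normalization values only
rgb_std = np.array([0.25, 0.25, 0.25])
species_key_map = {0: 101, 1: 102, 2: 205}  # model class index -> species PK in the DB
species_to_genus = {0: 0, 1: 0, 2: 1}       # encoded hierarchy: species class -> genus class

def preprocess(image_rgb):
    """Normalize an HxWx3 image array with the training-set statistics."""
    return (image_rgb - rgb_mean) / rgb_std

def sum_to_parent(probs, parent_map):
    """Sum child-level probabilities into their parent-level classes."""
    parent_probs = np.zeros(max(parent_map.values()) + 1)
    for child, p in enumerate(probs):
        parent_probs[parent_map[child]] += p
    return parent_probs

species_probs = np.array([0.6, 0.3, 0.1])
print(sum_to_parent(species_probs, species_to_genus))                        # [0.9 0.1]
print({species_key_map[i]: float(p) for i, p in enumerate(species_probs)})  # keyed by DB PK
```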
Detailed Steps: API Prediction POST Request:
- API receives a POST request with image data in the field ‘image’
- Image is saved to the mediafiles folder
- Image file is opened (with Pillow)
- Check the identifier (version number) for the active tensorflow model
- Load the preprocessing parameters (mean and standard deviation for the RGB values of the dataset used to train the active model)
- Preprocess the image (i.e. normalize image using appropriate mean and std dev values & convert image obj to a list)
- Post the image to the active model’s endpoint in the TensorFlow model container
- Logits (raw results) from the model are converted to probabilities using softmax (see the sketch after this list)
- Keys for the specific model are mapped to the species IDs in the database
- Probabilities are summed for each level in the taxonomic hierarchy
- Results are sorted from highest taxonomic level (family) to lowest (species)
- Top 10 results are selected
- String names for top results are queried
- Example images for top results are queried
- JSON response is formed and returned
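A minimal sketch of the softmax and top-10 steps, with illustrative helper names (the project's actual postprocessing code may differ):

```python
import numpy as np

def softmax(logits):
    """Convert raw model logits to probabilities."""
    exps = np.exp(logits - np.max(logits))  # subtract the max for numerical stability
    return exps / exps.sum()

def top_k(probs, k=10):
    """Return (class_index, probability) pairs for the k most likely classes."""
    order = np.argsort(probs)[::-1][:k]
    return [(int(i), float(probs[i])) for i in order]

# Example: logits as they might come back from the model
logits = np.array([2.1, -0.3, 0.7, 1.5])
print(top_k(softmax(logits), k=3))
```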
(reference https://www.tensorflow.org/tfx/serving/docker)
Make sure that the model you want to containerize is saved in the models folder in the model_serving directory. The model should be in the TensorFlow SavedModel format (tf.saved_model), with a numerical folder name (e.g. a datetime stamp such as 202004281030).
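For reference, a model can be written in this format with TensorFlow's standard export API; the model and paths below are placeholders:

```python
import tensorflow as tf

# A trivial stand-in model; in practice this is the trained classifier
model = tf.keras.Sequential([tf.keras.layers.Dense(3, input_shape=(4,))])

# Export under a numeric version folder name, as TensorFlow Serving expects
tf.saved_model.save(model, "model_serving/models/lepi/202004281030")
```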
Go to the model_serving folder and run:
bash make_docker_image.sh
This convenience script pulls the TensorFlow Serving base image, adds the selected model from the local models folder, and creates a new image tagged with latest and the model number (see the Dockerfile for more details).
docker run -p 8501:8501 [DOCKER_IMAGE_NAME]
# this exposes an endpoint that can be used for predictions
SERVER_URL = 'http://localhost:8501/v1/models/[DOCKER_IMAGE_NAME]:predict'
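To sanity-check the running container, you can post a batch of inputs to this endpoint. The JSON body below follows the standard TensorFlow Serving REST API; the model name and input shape are placeholders:

```python
import requests

SERVER_URL = "http://localhost:8501/v1/models/lepi:predict"  # replace 'lepi' with your model name

# One entry in 'instances' per input; its shape must match the model's input signature
payload = {"instances": [[[0.0, 0.0, 0.0]]]}  # placeholder pixel data

response = requests.post(SERVER_URL, json=payload)
response.raise_for_status()
print(response.json()["predictions"])  # raw model outputs (logits)
```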
Emails are automatically sent when a user is invited or a password reset code has been requested. This is handled in the background / asynchronously with Redis and Celery.
- Email: the actual HTML content sent by email. We use Django templates with HTML to be able to style the content as we want.
- EmailType: defines the dynamic content based on the email type and the HTML extension, for example registration or password reset.
- DevEmail: email addresses need to be registered as dev emails if you want to receive emails in development mode.
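For illustration, a background email task might look like the following minimal Celery sketch using Django's mail API (task and template names are assumptions, not the project's actual code):

```python
from celery import shared_task
from django.core.mail import send_mail
from django.template.loader import render_to_string

@shared_task
def send_templated_email(recipient, subject, template_name, context):
    """Render a Django HTML template and send it asynchronously via Celery."""
    html_body = render_to_string(template_name, context)
    send_mail(
        subject=subject,
        message="",             # plain-text fallback
        from_email=None,        # None falls back to DEFAULT_FROM_EMAIL
        recipient_list=[recipient],
        html_message=html_body,
    )

# Queued from a view, e.g.:
# send_templated_email.delay("user@example.com", "Welcome", "emails/registration.html", {"code": "1234"})
```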
- You will need to have Docker installed on your machine: https://docs.docker.com/get-docker/
- After cloning the repo, navigate in the terminal to the folder where the Dockerfile is located
- Run `docker build -t lepi .` to create the Docker image on your machine. You can replace 'lepi' with any name you want
- Replace the image name of the backend container in 'docker-compose.yml' with the name you used to build the image in the previous step
- Run `docker-compose up -d` in your terminal to start up all containers. This will also pull the needed Docker images such as Postgres, Redis etc. if you don't already have them on your machine
- Run `docker exec -ti lepi_backend_1 bash` in your terminal to log into the container running the REST API. Make sure 'lepi_backend_1' is the name of the running container; otherwise replace it with the correct name or the container ID
- Run `python manage.py runserver 0:8000` and visit http://0.0.0.0:8000/ ; you should get a 404 served by Django, which means it is running! We just do not serve anything on /
- Run `python manage.py migrate` to create the tables in your database (this only needs to be done the first time or when you update the models)
- Run `python manage.py createsuperuser` to create a superuser. Visit http://0.0.0.0:8000/backend/lepi-admin and use the credentials to log in
- To have emails sent automatically for registration and password resets, you will have to add a Gmail address and password in /envs/dev.env (see the settings sketch after this list)
- For emails to be sent out during development mode, you will also have to add your email through Django admin under DevEmails
- If you would like to add Sentry, which can help with monitoring and debugging, you will also have to add your Sentry DSN
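For orientation, the Gmail credentials from /envs/dev.env are typically wired into Django settings along these lines (the exact variable names used in settings.py may differ):

```python
import os

# Standard Django SMTP settings for sending mail via a Gmail account
EMAIL_BACKEND = "django.core.mail.backends.smtp.EmailBackend"
EMAIL_HOST = "smtp.gmail.com"
EMAIL_PORT = 587
EMAIL_USE_TLS = True
EMAIL_HOST_USER = os.environ.get("EMAIL_HOST_USER")          # the Gmail address
EMAIL_HOST_PASSWORD = os.environ.get("EMAIL_HOST_PASSWORD")  # an app password
```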
The containers are run using Podman which, as opposed to Docker, allows containers to be run by a non-root process.
Copy the contents of the podman/prod folder to the host server; save the files in the project folder or the user's home folder. Assuming you are in the ./podman folder:
scp -r ./prod
- transfer the image, fixture file & static file tar files to the server
- transfer the env variable files
- run docker-compose -f docker-compose.prod.yml -p biodex up --build
- run migrations to create the tables: docker-compose -f docker-compose.prod.yml exec web python manage.py migrate
- create a superuser for Django
- load the fixture files using the bash script: docker-compose exec web sh load_fixtures.sh