You can either build on this repository template or use it as a reference to build your model from scratch. Sample model templates are provided for both R and Python.
- Python or R
- Docker
- Synapse account
- Synapse project for the challenge
- Replace the code in the `run_model.*` script with your own algorithm(s). You can create additional scripts for modularization or better organization if desired.
- If using Python, update `requirements.txt` with any additional libraries/packages used by your script(s). If using R, update `requirements.R` and add/remove any libraries/packages listed in `pkg_list` that are used by your script(s).
- (optional) Locally run `run_model.*` to ensure it can run successfully. These scripts have been written so that the input and output locations are not hard-coded to the `/input` and `/output` directories, respectively (though they are used by default). This way, you can test your changes using any directories as input and/or output.

  For example, the following indicates that the input files are in `sample_data/`, while the output file should be written to the current working directory (`.`):

  Python

      python run_model.py --input-dir ../sample_data/ --output-dir .

  R

      Rscript run_model.R --input-dir ../sample_data/ --output-dir .
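As a rough illustration of how the `--input-dir` and `--output-dir` options shown above could be wired up in your own Python script, here is a minimal sketch using the standard-library `argparse` module. The function name `parse_args` is hypothetical, and the provided template may parse its options differently; only the option names and the `/input` and `/output` defaults come from the text above.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the directory options; `argv=None` means "read from sys.argv"."""
    parser = argparse.ArgumentParser(
        description="Run the model on a directory of input files."
    )
    # Defaults match the directories mounted during a submission run.
    parser.add_argument(
        "--input-dir",
        type=Path,
        default=Path("/input"),
        help="directory containing the input files",
    )
    parser.add_argument(
        "--output-dir",
        type=Path,
        default=Path("/output"),
        help="directory where the output file is written",
    )
    return parser.parse_args(argv)


# Equivalent to: python run_model.py --input-dir ../sample_data/ --output-dir .
args = parse_args(["--input-dir", "../sample_data/", "--output-dir", "."])
print(f"Reading from {args.input_dir}, writing to {args.output_dir}")
```

Keeping the defaults at `/input` and `/output` means the same script runs unchanged inside the container, while local tests only need to override the two options.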
- Again, make sure that all needed libraries/packages are specified in the `requirements.*` file. Because all Docker submissions are run without network access, you will not be able to install anything during the container run. If you do not want to use a `requirements.*` file, you may replace the `RUN` command with the following:

  Python

      RUN pip install pandas

  R

      RUN R -e "install.packages(c('optparse'), repos = 'http://cran.us.r-project.org')"
- `COPY` over any additional files required by your model. We recommend using one `COPY` command per file, as this can help speed up build time.
- Feel free to update the base image and/or tag version if the provided base image does not fulfill your needs. Although you can use any valid image as the base, we recommend using one of the Trusted Content images, especially if you are new to Docker. Images to consider:
  - ubuntu
  - python
  - bitnami/pytorch
  - r-base
  - rocker/tidyverse
- If your image takes some time to build, look at the order of your Dockerfile commands; the order matters. To best take advantage of Docker's build caching (that is, reusing previously built layers), it's often a good idea to put frequently changing parts (such as `run_model.*`) near the end of the Dockerfile. The way build caching works is that once a step needs to be rebuilt, all of the subsequent steps will also be rebuilt.
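To make the caching rule above concrete, the following toy model (not Docker's actual implementation) treats a Dockerfile as an ordered list of step labels and reports which steps would be rebuilt after a change. The function name `steps_to_rebuild` and the two example orderings are hypothetical and exist only for illustration.

```python
def steps_to_rebuild(steps, changed):
    """Return the steps that are rebuilt when any step in `changed` is modified.

    Simplified rule: the first changed step and every step after it are rebuilt;
    everything before it is reused from the cache.
    """
    for index, step in enumerate(steps):
        if step in changed:
            return steps[index:]
    return []


# Two hypothetical orderings of the same steps.
script_early = ["FROM python", "COPY run_model.py", "COPY requirements.txt", "RUN pip install"]
script_late = ["FROM python", "COPY requirements.txt", "RUN pip install", "COPY run_model.py"]

changed = {"COPY run_model.py"}
print(steps_to_rebuild(script_early, changed))
# ['COPY run_model.py', 'COPY requirements.txt', 'RUN pip install']
print(steps_to_rebuild(script_late, changed))
# ['COPY run_model.py']
```

With the script copied last, editing `run_model.py` no longer forces the (usually slow) package installation step to run again.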
- Assuming you are in either `r/` or `python/`, Dockerize your model:

      docker build -t docker.synapse.org/PROJECT_ID/my-model:v1 .

  where:
  - `PROJECT_ID`: Synapse ID of your project
  - `my-model`: name of your model
  - `v1`: version of your model
  - `.`: path to the directory containing the Dockerfile (the build context)

  Update the model name and/or tag name as desired.

  Important: The submission system uses the x86-64 CPU architecture. If your machine uses a different architecture, e.g. Apple Silicon, you will also need to add `--platform linux/amd64` to the command, e.g.

      docker build -t IMAGE_NAME --platform linux/amd64 FILEPATH_TO_DOCKERFILE
- (optional but highly recommended) Locally run a container to ensure the model can run successfully:

      docker run \
        --rm \
        --network none \
        --volume $PWD/sample_data:/input:ro \
        --volume $PWD/output:/output:rw \
        docker.synapse.org/PROJECT_ID/my-model:v1

  where:
  - `--rm`: removes the container once it is done running
  - `--network none`: disables all network connections to the container (emulating the behavior seen in the submission queues)
  - `--volume ...`: mounts data generated by and used by the container. For example, `--volume $PWD/sample_data:/input:ro` will mount `$PWD/sample_data` (from your machine) as `/input` (in the container) with read-only permissions.
  - `docker.synapse.org/PROJECT_ID/my-model:v1`: Docker image and tag version to run

  If your model requires a GPU, be sure to expose it by adding `--runtime nvidia` or `--gpus all` to the `docker run` command. Note that your local machine will also need the NVIDIA Container Toolkit.
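If you prefer to script the build and local test run described above, the sketch below assembles the same `docker build` and `docker run` argument lists in Python. The helper names (`image_name`, `build_command`, `run_command`) are hypothetical, and the architecture check is an assumption: it adds `--platform linux/amd64` whenever `platform.machine()` reports something other than `x86_64` or `amd64`.

```python
import platform
from pathlib import Path

REGISTRY = "docker.synapse.org"


def image_name(project_id, model="my-model", version="v1"):
    """Compose the full image reference for a Synapse project."""
    return f"{REGISTRY}/{project_id}/{model}:{version}"


def build_command(image, context=".", machine=None):
    """Return the `docker build` arguments, targeting x86-64 when needed."""
    machine = (machine or platform.machine()).lower()
    cmd = ["docker", "build", "-t", image]
    if machine not in {"x86_64", "amd64"}:
        # Assumption: any other architecture (e.g. arm64) needs the flag.
        cmd += ["--platform", "linux/amd64"]
    cmd.append(context)
    return cmd


def run_command(image, input_dir, output_dir):
    """Return the `docker run` arguments for an offline local test run."""
    input_dir = Path(input_dir).resolve()
    output_dir = Path(output_dir).resolve()
    return [
        "docker", "run", "--rm", "--network", "none",
        "--volume", f"{input_dir}:/input:ro",
        "--volume", f"{output_dir}:/output:rw",
        image,
    ]


image = image_name("PROJECT_ID")
print(" ".join(build_command(image)))
print(" ".join(run_command(image, "sample_data", "output")))
```

Each list can be passed to `subprocess.run(cmd, check=True)` to execute the command; printing it first is a convenient way to confirm the flags before running anything.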
- If you haven't already, log into the Synapse Docker registry with your Synapse credentials. We highly recommend you use a Synapse Personal Access Token (PAT) for this step. Once logged in, you should not have to log in again unless you log out or switch Docker registries.

      docker login docker.synapse.org --username SYNAPSE_USERNAME

  When prompted for a password, enter your PAT.

  You can also log in non-interactively through `STDIN`; this will prevent your password from being saved in the shell's history and log files. For example, if you saved your PAT into a file called `synapse.token`:

      cat ~/synapse.token | \
        docker login docker.synapse.org --username SYNAPSE_USERNAME --password-stdin
- Use `docker push` to push the model up to your project on Synapse.

      docker push docker.synapse.org/PROJECT_ID/my-model:v1

  The Docker image should now be available in the Docker tab of your Synapse project.
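The non-interactive login described above can also be driven from Python, which may be handy if the login and push steps are part of a larger script. This is a minimal sketch, assuming the PAT is stored in a file such as `~/synapse.token`; the function name `docker_login` and its `runner` parameter (which only exists so the call can be replaced in tests) are hypothetical.

```python
import subprocess
from pathlib import Path


def docker_login(username, token_file, registry="docker.synapse.org", runner=subprocess.run):
    """Log in to the registry, passing the token on stdin rather than as an argument."""
    token = Path(token_file).expanduser().read_text().strip()
    cmd = ["docker", "login", registry, "--username", username, "--password-stdin"]
    # The token is sent through stdin, so it never appears in the argument list.
    return runner(cmd, input=token, text=True, check=True)
```

For example, `docker_login("SYNAPSE_USERNAME", "~/synapse.token")` mirrors the `cat ~/synapse.token | docker login ...` command, after which `docker push` can be run as usual.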