First, install Apptainer.
To apptainerize your code, there are two possible routes:
Option 1: to convert the Docker image (generated by following the instructions in apptainer/) into an Apptainer image, proceed as follows:
# spin-up a local docker image registry (if not already up)
docker run -d -p 5000:5000 --restart=always --name registry registry:2
# push docker image to registry, then pull as apptainer image (.sif)
bash apptainer-build-from-docker.sh
# run test script
bash apptainer-run.sh
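For orientation, the push-then-pull step described in the comments above might look roughly like the sketch below. The contents of apptainer-build-from-docker.sh are not shown here, so this is not that script: the image name `jax:latest`, the `DRY_RUN` switch, and the helper names are assumptions made for illustration, and `jax.sif` is borrowed from the training command. By default the sketch only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: IMAGE, REGISTRY, SIF and DRY_RUN are assumed names, not taken from the repo's script.
IMAGE="${IMAGE:-jax:latest}"
REGISTRY="${REGISTRY:-localhost:5000}"
SIF="${SIF:-jax.sif}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only; 0 = also execute them

run() {
  # Print the command, and execute it unless DRY_RUN=1.
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

docker_to_sif() {
  # Tag the local image for the local registry and push it there.
  run docker tag "$IMAGE" "$REGISTRY/$IMAGE"
  run docker push "$REGISTRY/$IMAGE"
  # The local registry serves plain HTTP, so ask Apptainer not to use HTTPS.
  run apptainer build --no-https "$SIF" "docker://$REGISTRY/$IMAGE"
}

docker_to_sif
```

Set `DRY_RUN=0` to execute the commands. Older releases may spell the flag `--nohttps`, so check `apptainer build --help` on your installation.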
Option 2: design your ApptainerFile, then:
# build
bash apptainer-build.sh
# run test script
bash apptainer-run-native.sh
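If you are starting an ApptainerFile from scratch, the sketch below writes a minimal definition file as a starting point. Everything in it is an assumption rather than the repository's actual file: the base image, the installed packages, the `ApptainerFile.example` name, and the runscript, which is written so that a single quoted command string passed to `apptainer run` is executed by bash.

```shell
#!/usr/bin/env bash
# Sketch only: every package and path below is an assumption, not the repo's real ApptainerFile.
DEF_FILE="${DEF_FILE:-ApptainerFile.example}"   # hypothetical name, avoids clobbering a real file

# The quoted 'EOF' keeps "$*" literal so it is expanded by the runscript, not by this shell.
cat > "$DEF_FILE" <<'EOF'
Bootstrap: docker
From: python:3.11-slim

%post
    # Install whatever your training code needs; GPU use requires a CUDA-enabled JAX build.
    pip install --no-cache-dir jax matplotlib

%runscript
    # Run the arguments passed to `apptainer run` as one shell command line.
    exec /bin/bash -c "$*"
EOF

echo "wrote $DEF_FILE; build with: apptainer build jax.sif $DEF_FILE"
```

Whether the build needs `--fakeroot` or root privileges depends on your Apptainer version and system setup.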
Once either Option 1 or 2 is working, execute a training session via:
# execute training
# workspace = parent of the current directory; quoted so paths with spaces survive
WKSPACE_DIR=$(dirname "$(pwd)")
apptainer run --nv -B "$WKSPACE_DIR:$WKSPACE_DIR" jax.sif \
"cd '$WKSPACE_DIR' && python jax-nn-train.py --save_plot"
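Before a long training run, it can be worth confirming that JAX sees the GPU from inside the container. The sketch below is one way to do that, assuming the image ships `python` and `jax` (which the training command implies); `check_gpu` is a hypothetical helper name, and it skips quietly when Apptainer or the image is missing.

```shell
#!/usr/bin/env bash
# Sketch: list the devices JAX sees inside the container before training.
# check_gpu is a hypothetical helper, not part of the repo.
check_gpu() {
  local sif="${1:-jax.sif}"
  if ! command -v apptainer >/dev/null 2>&1 || [ ! -f "$sif" ]; then
    echo "skip: apptainer or $sif not available"
    return 0
  fi
  # exec bypasses the image's runscript and runs the given command directly;
  # --nv exposes the host's NVIDIA driver and devices to the container.
  apptainer exec --nv "$sif" python -c 'import jax; print(jax.devices())'
}

check_gpu jax.sif
```

If the printed list contains only CPU devices, the container is probably running a CPU-only JAX build or was started without `--nv`.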