This is a template for a Kaggle pipeline that runs on a GPU instance. It includes the following features to speed up development:
- 📦 Container: Docker is used to build the container for the pipeline. To optimize training on NVIDIA GPUs, the image is based on the PyTorch NGC Container.
- 📦 devcontainer: Developing inside a devcontainer ensures reproducibility and keeps the local environment clean.
- 📥 Package installer: uv is used to speed up package installation. Currently, uv cannot install packages outside a virtual environment, so the `VIRTUAL_ENV` environment variable is set to `/usr`.
- 📈 ML experiment manager: wandb is used, but any alternative (e.g., MLflow or Comet) would work.
- ✅ Code lint/format: ruff is used for both linting and formatting.
- ✅ Type check: mypy
- 📝 Test: pytest
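The `VIRTUAL_ENV` workaround mentioned in the feature list can be sketched as follows. This is a minimal illustration, not the template's actual setup: the helper names (`uv_install_command`, `uv_environment`, `install`) are hypothetical, and where the template itself sets the variable is not shown here. The sketch only assumes that `uv pip install` reads `VIRTUAL_ENV` to decide where packages go.

```python
"""Hypothetical sketch: run `uv pip install` with VIRTUAL_ENV pointing at /usr."""
import os
import subprocess


def uv_install_command(packages: list[str]) -> list[str]:
    """Build the uv command line for installing the given packages."""
    return ["uv", "pip", "install", *packages]


def uv_environment(prefix: str = "/usr") -> dict[str, str]:
    """Copy the current environment and point VIRTUAL_ENV at the given prefix."""
    env = dict(os.environ)
    env["VIRTUAL_ENV"] = prefix  # uv treats this prefix as the target environment
    return env


def install(packages: list[str], dry_run: bool = True) -> int:
    """Print the command in dry-run mode; otherwise run it and return its exit code."""
    cmd = uv_install_command(packages)
    env = uv_environment()
    if dry_run:
        print(f"VIRTUAL_ENV={env['VIRTUAL_ENV']} {' '.join(cmd)}")
        return 0
    return subprocess.run(cmd, env=env, check=False).returncode


if __name__ == "__main__":
    # Dry run by default, so nothing is installed when trying the sketch.
    install(["numpy"])
```

Inside a container image the same effect is usually achieved by setting the environment variable once for the whole image, so a wrapper like this is only useful for experimenting with the behavior.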
The host machine needs the following:
- Docker >= 20.10.13 (required for Compose V2)
- NVIDIA GPU Driver
- NVIDIA Container Toolkit
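The requirements above can be checked before building the container. The script below is a rough sketch and not part of the template: all function names are hypothetical, it assumes `docker version --format '{{.Server.Version}}'` prints a dotted version string, and it treats the presence of `nvidia-smi` and `nvidia-ctk` on `PATH` as a proxy for the driver and the Container Toolkit, which may not hold on every system.

```python
"""Hypothetical pre-flight check for the host requirements (not part of the template)."""
import re
import shutil
import subprocess

MIN_DOCKER = (20, 10, 13)  # minimum Docker version stated in the requirements


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the leading dotted numeric version, e.g. '24.0.7-ce' -> (24, 0, 7)."""
    match = re.match(r"\s*v?(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def docker_is_recent_enough(version_text: str) -> bool:
    """Compare the parsed version against the minimum, component by component."""
    return parse_version(version_text) >= MIN_DOCKER


def check_host() -> dict[str, bool]:
    """Return a best-effort pass/fail flag for each requirement."""
    results = {
        "docker": False,
        # Assumption: these binaries on PATH indicate a working install.
        "nvidia_driver": shutil.which("nvidia-smi") is not None,
        "nvidia_container_toolkit": shutil.which("nvidia-ctk") is not None,
    }
    if shutil.which("docker") is None:
        return results
    try:
        proc = subprocess.run(
            ["docker", "version", "--format", "{{.Server.Version}}"],
            capture_output=True, text=True, check=False, timeout=10,
        )
        if proc.returncode == 0:
            results["docker"] = docker_is_recent_enough(proc.stdout)
    except (subprocess.TimeoutExpired, OSError, ValueError):
        pass  # leave "docker" as False if the daemon cannot be queried
    return results


if __name__ == "__main__":
    for name, ok in check_host().items():
        print(f"{name}: {'ok' if ok else 'missing or too old'}")
```

The version comparison relies on Python's tuple ordering, so `20.10.13` passes while `20.10.12` does not.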
# Install just (a command runner) into ~/.local/bin.
INSTALL_DIR=~/.local/bin
curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash -s -- --to "$INSTALL_DIR"
export PATH="$PATH:$INSTALL_DIR"
# List the commands available for development.
just --list
just devcontainer-up
Attach VS Code to the running container from the Docker extension:
Docker extension -> CONTAINERS -> kaggle_pipeline.kaggle_pipeline-kaggle -> Attach Visual Studio Code
- Set up the Renovate bot
- CI