phalanx-hk/kaggle_pipeline

kaggle_pipeline

Overview

This is a template for a Kaggle pipeline on GPU instances. It includes the following features to speed up development:

  • 📦 Container : Docker is used to create a container for the pipeline. To optimize training on NVIDIA GPUs, it is based on the PyTorch NGC Container.

  • 📦 devcontainer : By using devcontainers, it is possible to ensure reproducibility and develop without polluting the local environment.

  • 📥 Package installer : uv is used to speed up package installation. uv currently can't install packages without a virtual environment, so I set the VIRTUAL_ENV environment variable to /usr.

  • 📈 ML Experiment manager : wandb is used, but any other tracker (e.g., MLflow or Comet) would work as well.

  • Code lint/format : ruff is used for both lint and format.

  • Type check : mypy

  • 📝 Test : pytest
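
As a rough illustration of the package installer bullet above, the following sketch installs dependencies into the system Python by pointing uv at /usr. It assumes uv is already on the PATH inside the container; `requirements.txt` is a hypothetical file name and may not match how this template declares its dependencies.

```shell
#!/usr/bin/env bash
# Sketch only: install packages with uv without creating a virtual environment.
# Assumption: requirements.txt is a hypothetical dependency file.

# Point uv at the system prefix, as described in the feature list.
export VIRTUAL_ENV=/usr

if command -v uv >/dev/null 2>&1 && [ -f requirements.txt ]; then
  uv pip install -r requirements.txt
else
  echo "skip: uv or requirements.txt not found"
fi
```

Recent uv releases also accept `uv pip install --system`, which may be a simpler alternative depending on the uv version in the image.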

Prerequisites

USAGE

install just

INSTALL_DIR=~/.local/bin
curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash -s -- --to $INSTALL_DIR
export PATH="$PATH:$INSTALL_DIR"
# list the commands available for development
just --list
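
The recipes printed by `just --list` are not reproduced here. As an assumption about what such recipes typically wrap, the sketch below calls the tools from the feature list (ruff, mypy, pytest) directly. The helper `run_if_available` and the `.` target are illustrative and are not part of the template.

```shell
#!/usr/bin/env bash
# Sketch only: run each quality tool directly if it is installed.
# run_if_available and the "." target are illustrative assumptions.
run_if_available() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@"
  else
    echo "skip: $1 is not installed"
  fi
}

status=0
run_if_available ruff check .   || status=1   # lint
run_if_available ruff format .  || status=1   # format
run_if_available mypy .         || status=1   # type check
run_if_available pytest         || status=1   # tests
echo "checks finished with status $status"
```

A tool that is missing is skipped rather than treated as a failure, so the same script can run both inside and outside the container.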

Build and run the container in detached mode

just devcontainer-up

Attach the container to VS Code

Use the Docker extension in VS Code to attach to the running container: Docker extension -> CONTAINERS -> kaggle_pipeline.kaggle_pipeline-kaggle -> Attach Visual Studio Code

TODO

  • apply renovate bot
  • CI
