Skip to content

Latest commit

 

History

History
32 lines (19 loc) · 2.75 KB

cuda.md

File metadata and controls

32 lines (19 loc) · 2.75 KB

Building CUDA-enabled Images for DataHub/DSMLP

In this branch we will cover the starting steps of creating a GPU accelerated Docker Image for DSMLP. It's recommended to follow the steps in the "master" branch before continuing.

Note: our standard scipy-ml-notebook has CUDA installed with a compatible PyTorch and Tensorflow for use on DSMLP. Please use that image if that's all you require since installing a working CUDA environment is time consuming.

DSMLP: Available GPUs

As of Fall 2020, there are 15 GPU nodes on the DSMLP cluster available for classroom use, and each node has 8 NVIDIA GPUs installed. These GPUs are dynamically assigned to a container on start-up when requested and will stay attached until that container is deleted, meaning a GPU will remain occupied even if it's actually not running anything.

GPU Model VRAM Amount Node
NVIDIA 1080 Ti 11GB 80 n01 through n12 except n09, n10
NVIDIA 2080 Ti 11GB 32 n18, n21, n22, n24
NVIDIA 1070 Ti 8GB 7 n10

Install CUDA, PyTorch, and TensorFlow

Note: The Datahub/DSMLP cannot guarantee that versions of Python libraries that use particular CUDA versions will work on the platform as CUDA versions must be synced to specific Linux x86_64 Driver Versions

The graphics driver will be installed automatically on the container on start-up. It can be challenging to get CUDA compatibility working in a custom environment. You can find the latest CUDA Toolkit supported on DSMLP by checking how we install it within our scipy-ml-notebook container. The version of CUDA in this Dockerfile is the latest version of CUDA the DSMLP supports, and versions > what's defined may not work on the platform.

Please see the scipy-ml-notebook Dockerfile for supported TensorFlow and Pytorch versions.

Resources/Further Reading