Skip to content
This repository has been archived by the owner on Feb 26, 2024. It is now read-only.
/ jupyterhub-docker Public archive

Docker recipes for deploying a JupyterHub server

License

Notifications You must be signed in to change notification settings

pitt-crc/jupyterhub-docker

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

21 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

JupyterHub deployment in use at the University of Pittsburgh

This is a JupyterHub deployment based on Docker currently in use at Pitt Center for Research Computing (CRC).

This repo is originally forked from a similar deployment at the Université de Versailles

Features

Learn more

This deployment is described in depth in this blog post.

Customizations made to the original deployment to fit our JupyterHub needs at the CRC:

Jupyterlab container customizations

  • Starting by the configuration of jupyterlab container (Dockerfile), we used a recent docker image (jupyter/tensorflow-notebook:python-3.10) that includes Python3.10 and the commonly used packages in machine and deep learning.
  • Install linux packages that might be need for running the image (via apt install).
  • Install python packages that might be need for running the image including (nb_conda_kernels) which is used to manage kernels for JupyterLab/Jupyter Notebook.
  • Prepare a configuration file for jupyter named "jupyter_config.json" with following contents:
{
  "CondaKernelSpecManager": {
    "kernelspec_path": "--user"
  }
}

and place it in the same directory as the (Dockerfile).

  • Add a line in the (Dockerfile) to copy the configuration file to the container (COPY ./jupyter_config.json /home/jovyan/.jupyter/jupyter_config.json).
  • You can test this container directly through running: podman-compose build jupyterlab then podman run --rm -p 8888:8888 jupyterlab_img and visit the generated tokenized link (looks like: http://127.0.0.1:8888/?token=xyz) to test jupyterlab.

Jupyterhub container customizations

  • Starting by the configuration of jupyterhub container (Dockerfile), we used a recent docker image (jupyterhub/jupyterhub:4.0.2).

  • Install custom python plugins and dependencies via pip install.

  • Configure environment so containers are managed by podman instead of docker (requires adding a volume in docker-compose.yml).

      ENV DOCKER_HOST=unix:///var/run/docker.sock
    
  • Edit the jupyterhub_config.py to include the following:

    • Add the following lines to the top of the file to import the required modules:
      import os
      import sys
      
      c = get_config()
      
      ## Generic
      c.JupyterHub.admin_access = True
      c.JupyterHub.hub_ip = os.environ['HUB_IP']
    
    • Add authenticator configuration:
      c.JupyterHub.authenticator_class = 'crc_jupyter_auth.RemoteUserAuthenticator'
      c.Authenticator.required_vpn_role = 'SAM-SSLVPNSAMUsers'
      c.Authenticator.missing_user_redirect = 'https://crc.pitt.edu/Access-CRC-Web-Portals'
      c.Authenticator.missing_role_redirect = 'https://crc.pitt.edu/Access-CRC-Web-Portals'
    
    • Add the docker spawner configuration:
      ## Spawners
      c.Spawner.default_url = '/lab'
      c.Spawner.debug = True
      c.Spawner.cpu_limit = 1
      c.Spawner.mem_limit = '4G'
      
      # Docker spawner
      # Common convention is to mount mounting user home directories under /home/jovyan/work
      c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'
      c.DockerSpawner.image = os.environ['DOCKER_JUPYTER_CONTAINER']
      c.DockerSpawner.notebook_dir = '/home/jovyan/work'
      c.DockerSpawner.volumes = {'jupyterhub-user-{username}': '/home/jovyan/work'}
      c.DockerSpawner.network_name = os.environ['DOCKER_NETWORK_NAME']
    
    • Add the following lines to the end of the file to configure the JupyterHub idle culler role and service:
      ## Services
      c.JupyterHub.load_roles = [
          {
              "name": "jupyterhub-idle-culler-role",
              "scopes": [
                  "list:users",
                  "read:users:activity",
                  "read:servers",
                  "delete:servers",
              ],
              "services": ["jupyterhub-idle-culler-service"],
          }
      ]
      
      c.JupyterHub.services = [
          {
              "name": "jupyterhub-idle-culler-service",
              "command": [sys.executable, "-m", "jupyterhub_idle_culler", "--timeout=3600" ],
          }
      ]
    

Customizations for the compose configuration

  • The main configuration file for Docker Compose is docker-compose.yml, which configures all containers (services, in Compose jargon), and associated volumes and networks.
  • Services:
    • JupyterHub configuration:
      • Mount the podman socket inside the container, so that it will be able to spawn containers for the single-user Jupyter servers.
      • Mount the JupyterHub configuration file inside the container, so that it will be able to read it.
      • Mount the JupyterHub data directory inside the container, so that it will be able to store its data persistently.
      • Set some environment variables for the Hub process, they will be used in the Hub configuration file jupyterhub_config.py:
        • DOCKER_JUPYTER_IMAGE is the name of the Docker image for the single-user servers; this must match the image configured in the jupyterlab section of docker-compose.yml (see below).
        • DOCKER_NETWORK_NAME is the name of the Docker network used by the services; normally, this network gets an automatic name from Docker Compose, but the Hub needs to know this name to connect the single-user servers to it.
        • DOCKER_NOTEBOOK_DIR is the directory inside the single-user server containers where the user's home directory will be mounted.
        • HUB_IP is the IP address of the Hub service within the docker network. By using the container_name Compose directive, we can set an alias for the IP, and use the same for HUB_IP.
      • Set the restart policy to always, so that the Hub container will be restarted automatically if it crashes.
      • Set the network name to the same name as the one used in the Hub configuration env variable above.
    • JupyterLab configuration:
      • Mount the JupyterLab data directory inside the container, so that it will be able to store its data persistently.
      • Set the build name to jupyterlab.
      • Set the image name to jupyterlab_img.
      • Set the container name to jupyterlab-throwaway.
      • Set the network name to the same name as the one used in the Hub configuration env variable above.
  • Volumes:
    • mount the jupyterhub_data volume to store the JupyterHub data persistently.
  • Networks:
    • Create a Docker network with the same name used above for the Hub and Lab services for the services to communicate with each other.

Run!

Once you are ready, build and launch the application from the jupyterhub_service account:

  1. First to switch from root to jupyterhub_service account:
  machinectl shell --uid=jupyterhub_service
  1. Then to build and launch the application:
  docker-compose build
  docker-compose up -d

Read the Docker Compose manual to learn how to manage your application.