This repository implements tools and analysis to measure NVIDIA GPU utilization in serverless environments.
This methodology is only applicable to Linux environments. To use the NVIDIA GPU, we need to install the NVIDIA drivers and the CUDA Toolkit (12.4). OpenFaaS, a popular serverless framework, is used to deploy the functions within the chosen runtime environment. Minikube, a local Kubernetes implementation, is used to create a development environment that simulates a Kubernetes cluster. Within Minikube, the necessary resources, including GPU support via the NVIDIA Container Toolkit, are configured. Nvtop, a GPU monitoring tool, is used to evaluate GPU utilization.
These are the basic prerequisites to get started; for more information, see the NVIDIA CUDA Installation Guide for Linux. You must have a CUDA-capable GPU:
lspci | grep -i nvidia
Verify you have a supported version of Linux
uname -m && cat /etc/*release
You should see output similar to the following, modified for your particular system
x86_64
Red Hat Enterprise Linux Workstation release 6.0 (Santiago)
Verify the System Has gcc Installed
gcc --version
These steps should help you install Docker. For further information, visit the Docker documentation.
$ sudo apt-get update
$ sudo apt-get install ca-certificates curl
$ sudo install -m 0755 -d /etc/apt/keyrings
$ sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
$ sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
$ echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Test Docker using the 'hello-world' image:
sudo docker run hello-world
If your GPU is compatible with the 550 driver series, you can execute the following commands. You can look up the best driver for your GPU here.
$ sudo apt-get update
$ sudo apt-get install -y linux-headers-$(uname -r) gcc make
$ wget https://us.download.nvidia.com/XFree86/Linux-x86_64/550.54.15/NVIDIA-Linux-x86_64-550.54.15.run
$ sudo chmod +x NVIDIA-Linux-x86_64-550.54.15.run
$ sudo ./NVIDIA-Linux-x86_64-550.54.15.run --silent --dkms
Verify you have access to the GPU using the NVIDIA driver:
nvidia-smi
You should see the NVIDIA driver version number and the highest compatible CUDA version.
The OS, architecture, and distribution details are below; for other distributions you can go here.
Operating System: Linux
Architecture: x86_64
Distribution: Ubuntu
Version: 22.04
Installer type: deb (local)
$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
$ sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
$ wget https://developer.download.nvidia.com/compute/cuda/12.4.1/local_installers/cuda-repo-ubuntu2204-12-4-local_12.4.1-550.54.15-1_amd64.deb
$ sudo dpkg -i cuda-repo-ubuntu2204-12-4-local_12.4.1-550.54.15-1_amd64.deb
$ sudo cp /var/cuda-repo-ubuntu2204-12-4-local/cuda-*-keyring.gpg /usr/share/keyrings/
$ sudo apt-get update
$ sudo apt-get -y install cuda-toolkit-12-4
Install the NVIDIA driver; choose one of the two options below, not both. For the legacy (proprietary) kernel modules:
$ sudo apt-get install -y cuda-drivers
Or, for the open kernel modules:
$ sudo apt-get install -y nvidia-driver-550-open
$ sudo apt-get install -y cuda-drivers-550
Verify CUDA installation:
nvcc --version
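If nvcc is not found after installation, the CUDA binaries may not be on your PATH yet. A typical post-installation step (assuming the default install location for CUDA 12.4; add these lines to your shell profile to make them persistent) is:
$ export PATH=/usr/local/cuda-12.4/bin${PATH:+:${PATH}}
$ export LD_LIBRARY_PATH=/usr/local/cuda-12.4/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}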
You can install the Container Toolkit using Apt, Zypper, or Yum. For Zypper and Yum, or if you have a multi-node cluster using containerd, see the NVIDIA Container Toolkit documentation.
Installing with Apt
$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
Optionally, configure the repository to use experimental packages:
$ sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
Update the package list from the repository and install the NVIDIA Container Toolkit packages:
$ sudo apt-get update
$ sudo apt-get install -y nvidia-container-toolkit
Configure the container runtime by using the nvidia-ctk command:
$ sudo nvidia-ctk runtime configure --runtime=docker
The nvidia-ctk command modifies the /etc/docker/daemon.json file on the host so that Docker can use the NVIDIA Container Runtime.
Restart the Docker daemon:
$ sudo systemctl restart docker
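To confirm that Docker can now access the GPU through the NVIDIA runtime, you can run the sample workload from the NVIDIA Container Toolkit documentation:
$ sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
You should see the same nvidia-smi table as on the host.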
If you plan to evaluate the GPU on a single-node cluster, use Minikube; otherwise, you can use containerd.
$ curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
$ sudo install minikube-linux-amd64 /usr/local/bin/minikube
Now that you have configured the container runtime, you can start the Kubernetes cluster.
Check if bpf_jit_harden is set to 0:
$ sudo sysctl net.core.bpf_jit_harden
If it's not 0, then run:
$ echo "net.core.bpf_jit_harden=0" | sudo tee -a /etc/sysctl.conf
$ sudo sysctl -p
Start Minikube:
$ minikube start --driver docker --container-runtime docker --gpus all
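You can verify that the cluster came up before continuing:
$ minikube status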
$ curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
$ curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl.sha256"
Validate the kubectl binary against the checksum file:
$ echo "$(cat kubectl.sha256) kubectl" | sha256sum --check
You should see output similar to the following:
kubectl:OK
Installing OpenFaaS via Arkade is pretty simple. If you face any issues installing OpenFaaS, you can head over to the OpenFaaS documentation. Get Arkade first:
$ curl -SLsf https://get.arkade.dev/ | sudo sh
Make the kubectl binary executable
$ chmod +x kubectl
$ sudo mv kubectl /usr/local/bin
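With kubectl installed, you can optionally check that the Minikube node advertises the GPU (this assumes the NVIDIA device plugin was set up by the --gpus flag; the exact output varies):
$ kubectl describe node minikube | grep -i nvidia.com/gpu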
$ arkade install openfaas
Install the FaaS-CLI
$ curl -SLsf https://cli.openfaas.com | sudo sh
Retrieve the gateway admin password:
$ echo $(kubectl -n openfaas get secret basic-auth -o jsonpath="{.data.basic-auth-password}" | base64 --decode)
$ kubectl rollout status -n openfaas deploy/gateway
$ kubectl port-forward -n openfaas svc/gateway 8080:8080 &
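Log in to the gateway with the admin password retrieved above (the standard OpenFaaS basic-auth flow; adjust if your gateway is exposed differently):
$ PASSWORD=$(kubectl -n openfaas get secret basic-auth -o jsonpath="{.data.basic-auth-password}" | base64 --decode)
$ echo -n $PASSWORD | faas-cli login --username admin --password-stdin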
Create your first Python function with OpenFaaS here
First pull the template repository
$ faas-cli template pull
Create a Python function. Similarly, you can replace the language with dockerfile, go, etc. and use any suitable name for your function.
$ faas-cli new --lang python python-function
This will create a directory with the function name and also functionname.yml
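To make the function actually exercise the GPU, the handler needs GPU code. Below is a minimal sketch of a handler using CuPy; this is a hypothetical example, and your function image must include CuPy and a CUDA runtime for it to work:
# functionname/handler.py — minimal GPU-exercising handler (sketch)
import cupy as cp

def handle(req):
    # Multiply two random matrices on the GPU to create measurable load.
    n = 2048
    a = cp.random.rand(n, n, dtype=cp.float32)
    b = cp.random.rand(n, n, dtype=cp.float32)
    c = a @ b
    cp.cuda.Stream.null.synchronize()  # wait for the GPU kernels to finish
    return "Computed a {}x{} matrix product; sum={:.2f}".format(n, n, float(c.sum()))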
Build the function. With the port-forward above, the OpenFaaS gateway URL is http://127.0.0.1:8080.
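The build command below references $GATEWAY_URL, which is not set anywhere else in this guide; assuming the port-forward above, you can define it as:
$ export GATEWAY_URL=http://127.0.0.1:8080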
$ faas-cli build -f functionname.yml -g $GATEWAY_URL
If you need any help building the function, you can get help from the faas-cli by typing:
$ faas-cli build --help
Edit functionname.yml. The default template will not have a namespace in it; make sure to include one.
version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  functionname:
    lang: python
    handler: ./functionname
    image: <your_dockerhub_username>/functionname:latest
    namespace: openfaas-fn
Deploy the function. This will build the image, push it to your Docker Hub repository, and deploy the function to OpenFaaS.
$ faas-cli up -f functionname.yml
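Note that faas-cli up pushes the image to the registry referenced in the image field, so make sure you are logged in to Docker Hub first:
$ docker login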
You can enable Prometheus monitoring for your function by using this command:
$ faas-cli deploy -f functionname.yml --annotation prometheus.io.scrape=true --annotation prometheus.io.port=8081
Check that the Prometheus status has changed to "true":
$ faas-cli describe functionname
Invoke the function
$ faas-cli invoke functionname
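You can also invoke the function over HTTP through the gateway, using the standard OpenFaaS function route (assuming the port-forward above):
$ curl -d "" http://127.0.0.1:8080/function/functionname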
To ensure your container is running in the cluster, list everything running on the cluster. You should be able to see your function pod in its namespace:
$ kubectl get pods -A
$ kubectl get services -A
Install nvtop to monitor the GPU:
$ sudo apt install nvtop
$ nvtop
If your function utilizes the GPU, you should be able to see a spike in the nvtop graph.
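To produce a sustained load while watching nvtop, one simple approach is to invoke the function in a loop (adjust the function name to yours):
$ while true; do echo "" | faas-cli invoke functionname > /dev/null; done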