docs: add doc page for enabling Intel GPUs for dss (#160)
This commit creates a new how-to guide in the documentation that describes enabling Intel GPUs for DSS. It follows the steps described in the [intel-device-plugins-for-kubernetes](https://github.com/intel/intel-device-plugins-for-kubernetes/blob/main/cmd/gpu_plugin/README.md) repo.

To avoid confusion between the guides for Intel and NVIDIA GPUs, the guide for enabling NVIDIA GPUs has been renamed to be more descriptive.

In the Jupyter Notebooks how-to guide that focuses on NVIDIA workflows, the description has been updated to reflect that change.

Fixes #148
mvlassis authored Aug 16, 2024
1 parent 7b52ae0 commit 55e4bcf
Showing 7 changed files with 146 additions and 11 deletions.
8 changes: 8 additions & 0 deletions docs/.custom_wordlist.txt
@@ -4,20 +4,26 @@ DaemonSet
DS
DSS
GeForce
gpu
GPUs
hostpath
Hostpath
HWE
Initialize
initialize
initialized
initializing
intel
io
ipynb
Jupyter
jupyter
JupyterLab
kubeconfig
kubectl
kubeflownotebookswg
kustomization
kustomize
runtime
runtimes
MacOS
@@ -28,6 +34,7 @@ microk8s.io
MLflow
mlruns
Multipass
NFD
Nvidia
OCI
ons
@@ -45,3 +52,4 @@ Tensorflow
toolkits
Validator
WSL
yaml
115 changes: 115 additions & 0 deletions docs/how-to/enable-gpus/enable-intel-gpu.rst
@@ -0,0 +1,115 @@
.. _enable_intel_gpu:

Enable Intel GPUs
=================

This guide describes how to configure Data Science Stack (DSS) to utilise the Intel GPUs on your machine.

You can do so by enabling the Intel device plugin on your `MicroK8s`_ cluster.

Prerequisites
-------------

* Ubuntu 22.04.
* DSS is :ref:`installed <install_DSS_CLI>` and :ref:`initialised <initialise_DSS>`.
* The `kubectl snap <https://snapcraft.io/kubectl>`_ package is installed.
* Your machine includes an Intel GPU.

Verify the Intel GPU drivers
----------------------------------------------------------

To confirm that your machine has the Intel GPU drivers set up, first install the `intel-gpu-tools` package:

.. code-block:: bash

   sudo apt install intel-gpu-tools

Now list the Intel GPU devices on your machine as follows:

.. code-block:: bash

   intel_gpu_top -L

If the drivers are correctly installed, you should see information about your GPU device such as the following:

.. code-block::

   card0 8086:56a0
   pci:vendor=8086,device=56A0,card=0
   └─renderD128

.. note::

   For Intel discrete GPUs on Ubuntu versions older than 24.04, you may need to perform additional steps such as installing a `HWE kernel <https://ubuntu.com/kernel/lifecycle>`_.
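
As a complementary check that needs no extra packages, the following sketch lists the DRM render nodes on the machine. It assumes GPU device nodes are exposed under ``/dev/dri``, the usual Linux location. The function name ``list_render_nodes`` and its optional directory argument are illustrative and not part of DSS. It reports render nodes from any vendor, not only Intel.

.. code-block:: bash

   # Illustrative helper (not part of DSS): print the DRM render nodes in a directory.
   # Defaults to /dev/dri, where the kernel normally exposes GPU device nodes.
   list_render_nodes() {
       local dri_dir="${1:-/dev/dri}"
       local node
       local found=1
       for node in "${dri_dir}"/renderD*; do
           # Skip the unexpanded pattern when nothing matches.
           [ -e "${node}" ] || continue
           basename "${node}"
           found=0
       done
       # Return 0 if at least one render node was printed, 1 otherwise.
       return "${found}"
   }

   if list_render_nodes; then
       echo "Render node(s) found."
   else
       echo "No render nodes found under /dev/dri."
   fi

A name such as ``renderD128`` corresponds to the last line of the ``intel_gpu_top -L`` output shown above.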

Enable the Intel GPU plugin
------------------------------------------------------

To ensure DSS can utilise Intel GPUs, you have to enable the Intel GPU plugin in your MicroK8s cluster.

1. Use `kubectl kustomize` to build the plugin YAML configuration files:

.. code-block:: bash

   VERSION=v0.30.0
   kubectl kustomize "https://github.com/intel/intel-device-plugins-for-kubernetes/deployments/nfd?ref=${VERSION}" > node_feature_discovery.yaml
   kubectl kustomize "https://github.com/intel/intel-device-plugins-for-kubernetes/deployments/nfd/overlays/node-feature-rules?ref=${VERSION}" > node_feature_rules.yaml
   kubectl kustomize "https://github.com/intel/intel-device-plugins-for-kubernetes/deployments/gpu_plugin/overlays/nfd_labeled_nodes?ref=${VERSION}" > gpu_plugin.yaml

To allow multiple containers to utilise the same GPU, run:

.. code-block:: bash

   sed -i 's/enable-monitoring/enable-monitoring\n        - -shared-dev-num=10/' gpu_plugin.yaml

2. Apply the built YAML files to your MicroK8s cluster:

.. code-block:: bash

   kubectl apply -f node_feature_discovery.yaml
   kubectl apply -f node_feature_rules.yaml
   kubectl apply -f gpu_plugin.yaml

The MicroK8s cluster is now configured to recognise and utilise your Intel GPU.

.. note::

   After the YAML configuration files have been applied, they can be safely deleted.
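
To see what the ``sed`` substitution from step 1 does before touching ``gpu_plugin.yaml``, the sketch below applies it to a small stand-in for the container ``args`` list, with the indentation written out to match the sample. The sample fragment and both function names are assumptions made for illustration, and the real file is longer. The example relies on GNU sed, as shipped with Ubuntu, to expand ``\n`` in the replacement text.

.. code-block:: bash

   # Hypothetical excerpt standing in for the args list in gpu_plugin.yaml.
   sample_args() {
       printf '%s\n' \
           '      - args:' \
           '        - -enable-monitoring'
   }

   # Append a -shared-dev-num=10 list item after the -enable-monitoring item.
   # Requires GNU sed, which turns \n in the replacement into a newline.
   add_shared_dev_num() {
       sed 's/enable-monitoring/enable-monitoring\n        - -shared-dev-num=10/'
   }

   sample_args | add_shared_dev_num

With GNU sed, the result is the original two lines followed by a new ``- -shared-dev-num=10`` item at the same indentation as ``- -enable-monitoring``, which keeps the YAML list valid.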

Verify the Intel GPU plugin
-------------------------------------------------
To verify the Intel GPU plugin is installed and the MicroK8s cluster recognises your GPU, run:

.. code-block:: bash

   kubectl get nodes --show-labels | grep intel

You should see output that includes the node name and the Intel GPU label, such as the following:

.. code-block:: bash

   kubectl get nodes --show-labels | grep intel
   fluent-greenshank Ready <none> 18s v1.30.3 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,intel.feature.node.kubernetes.io/gpu=true

Verify DSS detects the GPU
----------------------------------

Verify DSS has detected the GPU by checking the DSS status. To do so, run the following command using the DSS CLI:

.. code-block:: bash

   dss status

You should expect an output like this:

.. code-block:: bash

   Output:
   [INFO] MLflow deployment: Ready
   [INFO] MLflow URL: http://10.152.183.68:5000
   [INFO] NVIDIA GPU acceleration: Disabled
   [INFO] Intel GPU acceleration: Enabled

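
If you want to script this check, a minimal sketch follows. It assumes the status line is printed exactly as in the sample output above. The helper name ``intel_gpu_enabled`` is illustrative and not part of the DSS CLI.

.. code-block:: bash

   # Illustrative helper (not part of the DSS CLI): succeed only if the status
   # text on stdin reports Intel GPU acceleration as enabled.
   intel_gpu_enabled() {
       grep -q 'Intel GPU acceleration: Enabled'
   }

   # Run the check only when the dss command is available.
   # 2>&1 merges stderr in case the CLI writes its log lines there.
   if command -v dss >/dev/null 2>&1; then
       if dss status 2>&1 | intel_gpu_enabled; then
           echo "Intel GPU acceleration is enabled."
       else
           echo "Intel GPU acceleration is not reported as enabled." >&2
       fi
   fi

Matching on the full status line rather than on ``Enabled`` alone avoids a false positive when only NVIDIA acceleration is enabled.
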
See also
--------

* To learn how to manage your DSS environment, check :ref:`manage_DSS`.
* If you are interested in managing Jupyter Notebooks within your DSS environment, see :ref:`manage_notebooks`.
@@ -3,15 +3,15 @@
Enable NVIDIA GPUs
==================

This guide describes how to configure DSS to utilise your NVIDIA GPUs.
This guide describes how to configure Data Science Stack (DSS) to utilise your NVIDIA GPUs.

You can do so by configuring the underlying MicroK8s, on which DSS relies on for running the containerised workloads.
You can do so by configuring the underlying `MicroK8s`_, on which DSS relies for running the containerised workloads.

Prerequisites
-------------

* :ref:`MicroK8s is installed <set_microk8s>`.
* :ref:`DSS CLI is installed <install_DSS_CLI>` and :ref:`initialised <initialise_DSS>`.
* DSS is :ref:`installed <install_DSS_CLI>` and :ref:`initialised <initialise_DSS>`.
* Your machine includes an NVIDIA GPU.

.. _install_nvidia_operator:

12 changes: 12 additions & 0 deletions docs/how-to/enable-gpus/index.rst
@@ -0,0 +1,12 @@
.. _enable_gpus:

Enable GPUs
=============

The following guides cover configuration aspects to leverage your GPUs within the Data Science Stack (DSS) environment.

.. toctree::
:maxdepth: 1

enable-intel-gpu
enable-nvidia-gpu
8 changes: 4 additions & 4 deletions docs/how-to/index.rst
@@ -33,14 +33,14 @@ Learn how to manage `MLflow <Charmed MLflow_>`_ within your DSS environment.
:maxdepth: 1

mlflow

Leverage NVIDIA GPUs
Enable GPUs
--------------------

Learn how to configure DSS to utilise your NVIDIA GPUs.
Learn how to configure DSS to leverage your GPUs.

.. toctree::
:maxdepth: 1

enable-gpu
enable-gpus/index

4 changes: 2 additions & 2 deletions docs/how-to/jupyter-notebook.rst
@@ -44,8 +44,8 @@ You should expect an output like this:
[INFO] Success: Notebook test-notebook created successfully.
[INFO] Access the notebook at http://10.152.183.42:80.
Create a GPU-enabled notebook
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Create an NVIDIA GPU-enabled notebook
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can create a Jupyter Notebook containing CUDA runtimes and ML frameworks, and access its JupyterLab server.

2 changes: 1 addition & 1 deletion docs/tutorial/getting-started.rst
@@ -101,4 +101,4 @@ Next Steps
* To learn more about how to interact with DSS, see :ref:`manage_DSS`.
* To learn about handling data, check out :ref:`access-data`.
* To connect to MLflow, see :ref:`manage_MLflow`.
* To enable your NVIDIA GPUs, check out :ref:`nvidia_gpu`.
* To leverage your GPUs, see :doc:`Enable GPUs <../how-to/enable-gpus/index>`.
