diff --git a/docs/.custom_wordlist.txt b/docs/.custom_wordlist.txt index 239f808..3ee7cbd 100644 --- a/docs/.custom_wordlist.txt +++ b/docs/.custom_wordlist.txt @@ -4,20 +4,26 @@ DaemonSet DS DSS GeForce +gpu GPUs hostpath Hostpath +HWE Initialize initialize initialized initializing +intel io ipynb Jupyter jupyter JupyterLab kubeconfig +kubectl kubeflownotebookswg +kustomization +kustomize runtime runtimes MacOS @@ -28,6 +34,7 @@ microk8s.io MLflow mlruns Multipass +NFD Nvidia OCI ons @@ -45,3 +52,4 @@ Tensorflow toolkits Validator WSL +yaml diff --git a/docs/how-to/enable-gpus/enable-intel-gpu.rst b/docs/how-to/enable-gpus/enable-intel-gpu.rst new file mode 100644 index 0000000..e5aff35 --- /dev/null +++ b/docs/how-to/enable-gpus/enable-intel-gpu.rst @@ -0,0 +1,115 @@ +.. _enable_intel_gpu: + +Enable Intel GPUs +================= + +This guide describes how to configure Data Science Stack (DSS) to utilise the Intel GPUs on your machine. + +You can do so by enabling the Intel device plugin on your `MicroK8s`_ cluster. + +Prerequisites +------------- + +* Ubuntu 22.04. +* DSS is :ref:`installed ` and :ref:`initialised `. +* The `kubectl snap `_ package is installed. +* Your machine includes an Intel GPU. + +Verify the Intel GPU drivers +---------------------------- + +To confirm that your machine has the Intel GPU drivers set up, first install the `intel-gpu-tools` package: + +.. code-block:: bash + + sudo apt install intel-gpu-tools + +Now list the Intel GPU devices on your machine as follows: + +.. code-block:: bash + + intel_gpu_top -L + +If the drivers are correctly installed, you should see information about your GPU device such as the following: + +.. code-block:: + + card0 8086:56a0 + pci:vendor=8086,device=56A0,card=0 + └─renderD128 + +.. note:: + For Intel discrete GPUs on Ubuntu versions older than 24.04, you may need to perform additional steps such as installing a `HWE kernel `_.
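The driver check above can also be cross-checked without `intel-gpu-tools`: a working Intel GPU driver exposes DRM render nodes under `/dev/dri` (such as the `renderD128` shown in the sample output). A minimal sketch; this is a heuristic, and the exact node name varies by machine:

```shell
# Look for DRM render nodes; their presence suggests the GPU driver is loaded.
# Heuristic cross-check only, not a substitute for intel_gpu_top -L.
if ls /dev/dri/renderD* >/dev/null 2>&1; then
    echo "render node present"
else
    echo "no render node found"
fi
```

If no render node is found, revisit the driver installation (and, on older Ubuntu releases, the HWE kernel note above) before enabling the plugin.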
+ +Enable the Intel GPU plugin +--------------------------- + +To ensure DSS can utilise Intel GPUs, you must enable the Intel GPU plugin in your MicroK8s cluster. + +1. Use `kubectl kustomize` to build the plugin YAML configuration files: + +.. code-block:: bash + + VERSION=v0.30.0 + kubectl kustomize https://github.com/intel/intel-device-plugins-for-kubernetes/deployments/nfd?ref=${VERSION} > node_feature_discovery.yaml + kubectl kustomize https://github.com/intel/intel-device-plugins-for-kubernetes/deployments/nfd/overlays/node-feature-rules?ref=${VERSION} > node_feature_rules.yaml + kubectl kustomize https://github.com/intel/intel-device-plugins-for-kubernetes/deployments/gpu_plugin/overlays/nfd_labeled_nodes?ref=${VERSION} > gpu_plugin.yaml + +To allow multiple containers to utilise the same GPU, run: + +.. code-block:: bash + + sed -i 's/enable-monitoring/enable-monitoring\n - -shared-dev-num=10/' gpu_plugin.yaml + +2. Apply the built YAML files to your MicroK8s cluster: + +.. code-block:: bash + + kubectl apply -f node_feature_discovery.yaml + kubectl apply -f node_feature_rules.yaml + kubectl apply -f gpu_plugin.yaml + +The MicroK8s cluster is now configured to recognise and utilise your Intel GPU. + +.. note:: + After the YAML configuration files have been applied, they can be safely deleted. + +Verify the Intel GPU plugin +--------------------------- +To verify the Intel GPU plugin is installed and the MicroK8s cluster recognises your GPU, run: + +.. code-block:: bash + + kubectl get nodes --show-labels | grep intel + +You should see an output including your node name, such as the following: + +.. 
code-block:: bash + + kubectl get nodes --show-labels | grep intel + fluent-greenshank Ready 18s v1.30.3 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,intel.feature.node.kubernetes.io/gpu=true + +Verify DSS detects the GPU +-------------------------- + +Verify DSS has detected the GPU by checking the DSS status. To do so, run the following DSS CLI command: + +.. code-block:: bash + + dss status + +You should expect an output like this: + +.. code-block:: bash + + Output: + [INFO] MLflow deployment: Ready + [INFO] MLflow URL: http://10.152.183.68:5000 + [INFO] NVIDIA GPU acceleration: Disabled + [INFO] Intel GPU acceleration: Enabled + +See also +-------- + +* To learn how to manage your DSS environment, check :ref:`manage_DSS`. +* If you are interested in managing Jupyter Notebooks within your DSS environment, see :ref:`manage_notebooks`. diff --git a/docs/how-to/enable-gpu.rst b/docs/how-to/enable-gpus/enable-nvidia-gpu.rst similarity index 92% rename from docs/how-to/enable-gpu.rst rename to docs/how-to/enable-gpus/enable-nvidia-gpu.rst index fdbae9a..37c5c94 100644 --- a/docs/how-to/enable-gpu.rst +++ b/docs/how-to/enable-gpus/enable-nvidia-gpu.rst @@ -3,15 +3,15 @@ Enable NVIDIA GPUs ================== -This guide describes how to configure DSS to utilise your NVIDIA GPUs. +This guide describes how to configure Data Science Stack (DSS) to utilise your NVIDIA GPUs. -You can do so by configuring the underlying MicroK8s, on which DSS relies on for running the containerised workloads. +You can do so by configuring the underlying `MicroK8s`_, on which DSS relies for running the containerised workloads. Prerequisites ------------- -* :ref:`MicroK8s is installed `. -* :ref:`DSS CLI is installed ` and :ref:`initialised `. +* DSS is :ref:`installed ` and :ref:`initialised `. +* Your machine includes an NVIDIA GPU. .. 
_install_nvidia_operator: diff --git a/docs/how-to/enable-gpus/index.rst b/docs/how-to/enable-gpus/index.rst new file mode 100644 index 0000000..fa9666e --- /dev/null +++ b/docs/how-to/enable-gpus/index.rst @@ -0,0 +1,12 @@ +.. _enable_gpus: + +Enable GPUs +=========== + +The following guides describe how to configure your Data Science Stack (DSS) environment to leverage your GPUs. + +.. toctree:: + :maxdepth: 1 + + enable-intel-gpu + enable-nvidia-gpu diff --git a/docs/how-to/index.rst b/docs/how-to/index.rst index 6c618a3..fdebf0f 100644 --- a/docs/how-to/index.rst +++ b/docs/how-to/index.rst @@ -33,14 +33,14 @@ Learn how to manage `MLflow `_ within your DSS environment. :maxdepth: 1 mlflow - -Leverage NVIDIA GPUs + +Enable GPUs -------------------- -Learn how to configure DSS to utilise your NVIDIA GPUs. +Learn how to configure DSS to leverage your GPUs. .. toctree:: :maxdepth: 1 - enable-gpu + enable-gpus/index diff --git a/docs/how-to/jupyter-notebook.rst b/docs/how-to/jupyter-notebook.rst index 7cc3070..a16b9e9 100644 --- a/docs/how-to/jupyter-notebook.rst +++ b/docs/how-to/jupyter-notebook.rst @@ -44,8 +44,8 @@ You should expect an output like this: [INFO] Success: Notebook test-notebook created successfully. [INFO] Access the notebook at http://10.152.183.42:80. -Create a GPU-enabled notebook -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Create an NVIDIA GPU-enabled notebook +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ You can create a Jupyter Notebook containing CUDA runtimes and ML frameworks, and access its JupyterLab server. diff --git a/docs/tutorial/getting-started.rst b/docs/tutorial/getting-started.rst index 5d3608a..8e099b9 100644 --- a/docs/tutorial/getting-started.rst +++ b/docs/tutorial/getting-started.rst @@ -101,4 +101,4 @@ Next Steps * To learn more about how to interact with DSS, see :ref:`manage_DSS`. * To learn about handling data, check out :ref:`access-data`. * To connect to MLflow, see :ref:`manage_MLflow`. 
-* To enable your NVIDIA GPUs, check out :ref:`nvidia_gpu`. +* To leverage your GPUs, see :doc:`Enable GPUs <../how-to/enable-gpus/index>`.
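The `sed` edit in the Intel guide (inserting `-shared-dev-num=10` after the `enable-monitoring` flag) can be sanity-checked on a throwaway fragment before editing the real `gpu_plugin.yaml`. A sketch, using a hypothetical sample file and indentation; the real file's indentation may differ:

```shell
# Build a tiny stand-in for the plugin args section (hypothetical fragment).
cat > /tmp/gpu_plugin_sample.yaml <<'EOF'
        args:
          - -enable-monitoring
EOF

# Same substitution as in the guide: append the shared-device argument
# on a new line directly after the enable-monitoring flag (GNU sed).
sed -i 's/enable-monitoring/enable-monitoring\n          - -shared-dev-num=10/' /tmp/gpu_plugin_sample.yaml

cat /tmp/gpu_plugin_sample.yaml
```

The printed fragment should list `- -shared-dev-num=10` directly below `- -enable-monitoring`; if it does, the same substitution is safe to apply to the generated `gpu_plugin.yaml`.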