Test Cases for E2E Demo.

  • Provisioning of GPU Node

    • MachineSet is created.

      • The name of the MachineSet contains -gpu-. This should not be a hard requirement, though.

      • The MachineSet has a taints section:

        taints:
          - effect: NoSchedule
            key: odh-notebook   # <--- use your own taint name or skip it altogether
            value: 'true'
      • The MachineSet has a label:

         metadata:
           labels:
             node-role.kubernetes.io/odh-notebook: ''   # <--- put your own label here if needed
      • The MachineSet specifies the instance type:

         providerSpec:
           value:
             ...
             instanceType: g5.2xlarge   # <--- change the VM type if needed
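      • A CLI cross-check of the MachineSet spec (a sketch; openshift-machine-api is the default namespace for MachineSets, the grep pattern assumes the -gpu- naming above, and <gpu-machineset-name> is a placeholder):

         # list the MachineSets and pick out the GPU one
         oc get machinesets -n openshift-machine-api | grep gpu
         # dump its spec and confirm the taints, label and instanceType shown above
         oc get machineset <gpu-machineset-name> -n openshift-machine-api -o yaml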
      • The nodes are provisioned with the proper label, and the number of pods running on them should be greater than 20 (see the CLI sketch below). (screenshot: labeled nodes)
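
      • The same check from the CLI (a sketch; assumes the node label configured above, and <gpu-node-name> is a placeholder):

         # nodes carrying the GPU label
         oc get nodes -l node-role.kubernetes.io/odh-notebook
         # count the pods running on one of those nodes
         oc get pods --all-namespaces --no-headers --field-selector spec.nodeName=<gpu-node-name>,status.phase=Running | wc -l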

      • NVIDIA pods should be running on the GPU nodes. Check the pods scheduled on those nodes (see the CLI sketch below). (screenshot: nvidia pods)
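
      • A CLI sketch for this check (nvidia-gpu-operator is the operator's usual namespace, but the name is an assumption here):

         # GPU operator pods should be Running (or Completed) and scheduled on the GPU nodes
         oc get pods -n nvidia-gpu-operator -o wide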

    • Verify the Node Feature Discovery Operator is installed:

      • Select Installed Operators from the left navigation bar and, under Projects, select All Projects. The Node Feature Discovery Operator should be listed as installed. (screenshot: nfd operator)

      • Click on the Node Feature Discovery Operator. Under NodeFeatureDiscovery, an instance should have been created and its status should be Available. (screenshot: nfd instance)
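
      • The same checks from the CLI (a sketch; openshift-nfd is the usual namespace for the operator and its instance, which is an assumption here):

         # the NFD operator CSV should report a Succeeded phase
         oc get csv -n openshift-nfd
         # the NodeFeatureDiscovery instance should exist and be Available
         oc get nodefeaturediscovery -n openshift-nfd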

    • Verify the NVIDIA GPU Operator is installed.

      • The NVIDIA GPU Operator appears under Installed Operators. (screenshot: nvidia operator)

      • Click on the NVIDIA GPU Operator and then on ClusterPolicy. A gpu-cluster-policy instance should exist. (screenshot: nvidia clusterpolicies)

      • Click on the gpu-cluster-policy and then on the YAML tab. The YAML should contain the tolerations (a CLI cross-check follows the snippet):

         tolerations:
           - effect: NoSchedule
             key: odh-notebook
             value: 'true'
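      • The same checks from the CLI (a sketch; ClusterPolicy is cluster-scoped, so no namespace is needed):

         # the gpu-cluster-policy instance should exist
         oc get clusterpolicy gpu-cluster-policy
         # confirm the tolerations above appear in its spec
         oc get clusterpolicy gpu-cluster-policy -o yaml | grep -A 4 'tolerations:'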
  • The application is provisioned correctly.

    • Click on the rag-llm namespace

      • By default, the EDB operator is deployed, which in turn deploys the PGVECTOR vector database; 6 pods should be running. (screenshot: ragllm pgvector pods)

      • If global.db.type is set to REDIS in values-global.yaml, four pods should be running (see the CLI sketch below). (screenshot: ragllm pods)
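
      • A CLI sketch covering either database variant:

         # count and inspect the application pods
         oc get pods -n rag-llm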

      • Click on Networking → Routes in the left navigation panel. An llm-ui route should exist. (screenshot: llm-ui route)

      • Click on the link under the Location column; it should launch the application (the CLI sketch below prints the same URL). (screenshot: llm-ui application)
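
      • A CLI sketch for the route check (the llm-ui route name comes from the step above):

         # confirm the llm-ui route exists and print its host
         oc get route llm-ui -n rag-llm -o jsonpath='{.spec.host}'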

      • Enter 'IBM' as the customer name and 'RedHat OpenShift' as the product, then click Generate. A project proposal should be generated. (screenshot: llm-ui project)

      • Click on Ratings to rate the model.

  • Verify Grafana and Prometheus are installed correctly.

    • By default, the Grafana application is deployed in the llm-monitoring namespace. To launch the Grafana dashboard, follow the instructions below:
      • Grab the credentials of the Grafana application

        • Navigate to Workloads --> Secrets
        • Click on grafana-admin-credentials and copy the GF_SECURITY_ADMIN_USER and GF_SECURITY_ADMIN_PASSWORD values (or use the CLI sketch below).
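        • A CLI alternative (a sketch; the secret name and keys come from the step above):

          # print the admin user and password stored in the secret
          oc extract secret/grafana-admin-credentials -n llm-monitoring --to=-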
      • Launch Grafana Dashboard

        • Navigate to Networking --> Routes in the llm-monitoring namespace.
        • Click on the Location link for grafana-route (the CLI sketch after this list prints the same host).
        • Enter the Grafana admin credentials.
        • Ratings are displayed for each model.
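        • A CLI sketch for the route step above (the grafana-route name comes from that step):

          # print the Grafana route host
          oc get route grafana-route -n llm-monitoring -o jsonpath='{.spec.host}'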
      • The Grafana dashboard is displayed. (screenshot: llm-ui grafana)