Show GPU stats on the node utilization page #3576

Open
SilinPavel opened this issue Jun 27, 2024 · 1 comment
Labels
kind/enhancement New feature or request state/has-doc Issues that have documentation

Comments

@SilinPavel
Member

Metrics

  • GPU Utilization
  • GPU Memory utilization

Dimensions

  • Time
  • GPU ID (Model)

Node stats

Global metrics:

  • Time GPU active (% of the node run time during which GPU utilization was > 0)
  • GPU utilization (mean/max/min over the per-GPU average utilization for the node run time)
  • GPU memory (computed the same way as GPU utilization)
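The global metrics above can be sketched as follows. This is only an illustration under stated assumptions, not the implemented logic: the input shape (`samples`, a mapping of GPU ID to utilization percentages per time point) and the function name `global_gpu_metrics` are hypothetical, and a time point is assumed to count as "active" when at least one GPU reports utilization > 0.

```python
from statistics import mean


def global_gpu_metrics(samples):
    """Compute node-level GPU metrics.

    samples: {gpu_id: [utilization % at each time point]} (hypothetical shape);
    all lists are assumed to have the same, non-zero length.
    """
    # Average utilization of each GPU over the whole node run time.
    per_gpu_avg = [mean(values) for values in samples.values()]

    # Assumption: a time point is active if any GPU has utilization > 0.
    n_points = len(next(iter(samples.values())))
    active_points = sum(
        1
        for t in range(n_points)
        if any(values[t] > 0 for values in samples.values())
    )

    return {
        "time_active_pct": 100.0 * active_points / n_points,
        "util_mean": mean(per_gpu_avg),
        "util_max": max(per_gpu_avg),
        "util_min": min(per_gpu_avg),
    }
```

The same function could be applied to memory utilization samples, since GPU memory is described as following the same aggregation.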

Global chart:

  • GPU active (% of cards that have GPU utilization > 0)
  • GPU utilization
  • GPU memory
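For the "GPU active" chart series, a minimal sketch is shown here, assuming the same hypothetical `samples` mapping of GPU ID to utilization percentages per time point. The function name `gpu_active_series` is illustrative only.

```python
def gpu_active_series(samples):
    """Return, for each time point, the % of cards with utilization > 0.

    samples: {gpu_id: [utilization % at each time point]} (hypothetical shape);
    all lists are assumed to have the same length.
    """
    n_gpus = len(samples)
    n_points = len(next(iter(samples.values())))
    return [
        100.0 * sum(1 for values in samples.values() if values[t] > 0) / n_gpus
        for t in range(n_points)
    ]
```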

Heatmaps:

Same metrics as on the charts

  • One line per GPU ID
  • Each time point is colored on a 0-100% gradient
  • Shown only when explicitly requested

Details:

  • Details are shown for all metrics at a hovered timepoint
  • GPU utilization and GPU memory metrics as heatmaps for the selected time point (semi-heatmap)
  • Do not include GPU Active in the details
  • Memory: show the absolute value together with the percentage
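One possible way to render the memory value in the details view is sketched below. The unit (MiB), the rounding, and the function name `format_gpu_memory` are assumptions made for illustration; the issue does not specify them.

```python
def format_gpu_memory(used_mib, total_mib):
    """Format GPU memory as an absolute value followed by its percentage.

    Units (MiB) and rounding are assumptions, not taken from the issue.
    """
    # Guard against a zero total so the label can still be rendered.
    pct = 100.0 * used_mib / total_mib if total_mib else 0.0
    return f"{used_mib:.0f} MiB ({pct:.0f}%)"
```

For example, `format_gpu_memory(8192, 16384)` produces `8192 MiB (50%)`.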
@SilinPavel SilinPavel added the kind/enhancement New feature or request label Jun 27, 2024
NShaforostov added a commit that referenced this issue Sep 2, 2024
* (Issue #3619) 'Runs archiving' doc
* (Issue #3573) 'Container limits' doc
* (Issue #3568) 'Compose a Dockerfile' doc
* (Issue #3576) 'GPU statistics monitor' doc
* (Issue #3602) 'Pod network consumption alert and restriction' doc
@NShaforostov
Collaborator

Docs added via #3669 and located here.

@NShaforostov NShaforostov added the state/has-doc Issues that have documentation label Dec 27, 2024