plugin/cuda: disable CUDA plugin if /dev/nvidiactl isn't present #2479

Merged (1 commit) on Sep 15, 2024

avagin (Member) commented Sep 14, 2024

The presence of /dev/nvidiactl indicates that the system has a compatible NVIDIA GPU driver installed and that the GPU is accessible to the operating system.
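
A minimal sketch of this kind of check in C, assuming the plugin decides at initialization time whether to stay enabled. The function names `device_node_present` and `cuda_plugin_should_load` are hypothetical and are not taken from the merged commit; only the device path comes from this PR.

```c
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Path of the NVIDIA control device, as named in this PR. */
#define NVIDIA_CTL_DEVICE "/dev/nvidiactl"

/*
 * Hypothetical helper: returns true if the given path exists.
 * access() with F_OK tests only for existence, not for read or
 * write permission.
 */
bool device_node_present(const char *path)
{
	return access(path, F_OK) == 0;
}

/*
 * Hypothetical init-time decision: the plugin stays enabled only when
 * the control device is present. The path is a parameter so the logic
 * can be exercised on machines without an NVIDIA driver.
 */
bool cuda_plugin_should_load(const char *ctl_path)
{
	if (!device_node_present(ctl_path)) {
		fprintf(stderr, "%s not found, CUDA plugin disabled\n", ctl_path);
		return false;
	}
	return true;
}
```

A caller would pass `NVIDIA_CTL_DEVICE` during plugin initialization and skip registering any CUDA hooks when the function returns false.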

avagin (Member, Author) commented Sep 14, 2024

@jesus-ramos Is it right to assume that /dev/nvidiactl must always exist in order to run CUDA workloads? I am going to release v4.0 next week with the CUDA plugin. Do you have anything that we need to merge into that release?

jesus-ramos (Contributor) left a comment

LGTM.

Yeah we expect /dev/nvidiactl to exist so that should be fine.

Nothing else to merge from me just yet. Looking forward to the release, and thanks to you and Radostin for your help and work so far.

The presence of /dev/nvidiactl indicates that the system has a
compatible NVIDIA GPU driver installed and that the GPU is accessible to
the operating system.

Signed-off-by: Andrei Vagin <[email protected]>
@avagin avagin merged commit 6551847 into checkpoint-restore:criu-dev Sep 15, 2024
34 of 40 checks passed
rst0git (Member) commented Sep 16, 2024

> Do you have anything that we need to merge into that release?

@avagin Currently we don't have a way of identifying if a checkpoint contains GPU state (e.g., if the CUDA plugin has been disabled during dump, it should be disabled during restore). It might be useful to add a boolean flag in the inventory image.

avagin (Member, Author) commented Sep 19, 2024

@rst0git It should be done in a generic way for all plugins. The first idea that comes to mind is to keep an array of required plugins in the inventory image. Plugins should be able to add new items to that array. Then, on restore, we have to check that all required plugins are loaded.
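
The idea above could be sketched roughly as follows in C. This is only an illustration under stated assumptions: `struct required_plugins`, `require_plugin`, and `first_missing_plugin` are hypothetical names, the fixed-size in-memory array stands in for whatever would actually be serialized into the inventory image, and none of this reflects an existing CRIU API.

```c
#include <stddef.h>
#include <string.h>

#define MAX_REQUIRED_PLUGINS 16

/* Hypothetical stand-in for the list stored in the inventory image. */
struct required_plugins {
	const char *names[MAX_REQUIRED_PLUGINS];
	size_t count;
};

/*
 * Dump side: a plugin registers itself as required.
 * Returns 0 on success (including when the name is already listed)
 * and -1 when the list is full.
 */
int require_plugin(struct required_plugins *req, const char *name)
{
	for (size_t i = 0; i < req->count; i++) {
		if (strcmp(req->names[i], name) == 0)
			return 0;
	}
	if (req->count >= MAX_REQUIRED_PLUGINS)
		return -1;
	req->names[req->count++] = name;
	return 0;
}

/*
 * Restore side: returns NULL when every required plugin appears in the
 * loaded list, otherwise the name of the first required plugin that is
 * missing.
 */
const char *first_missing_plugin(const struct required_plugins *req,
				 const char *const *loaded, size_t n_loaded)
{
	for (size_t i = 0; i < req->count; i++) {
		size_t j;

		for (j = 0; j < n_loaded; j++) {
			if (strcmp(req->names[i], loaded[j]) == 0)
				break;
		}
		if (j == n_loaded)
			return req->names[i];
	}
	return NULL;
}
```

With this shape, restore could fail early with a message naming the missing plugin, and a checkpoint taken with the CUDA plugin disabled would simply not list it as required.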
