Fix get_device_name for cuda platforms that return bytes #12636

mgoin · 2025-02-01T02:49:11Z

A more general fix for the proposed changes in #12565 and #12635
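For readers following along, here is a minimal sketch of the kind of normalization under discussion. It assumes the device name can arrive as either `str` or `bytes` depending on the installed pynvml. `normalize_device_name` is a hypothetical helper written for illustration, not the code changed in this PR.

```python
from typing import Union


def normalize_device_name(name: Union[str, bytes]) -> str:
    """Return the device name as str (hypothetical helper, not vLLM code)."""
    # Assumption: some NVML bindings hand back bytes instead of str.
    if isinstance(name, bytes):
        return name.decode("utf-8")
    return name
```

Handling both types in one place means callers never need to know which binding produced the value.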

Signed-off-by: mgoin <[email protected]>

github-actions · 2025-02-01T02:49:22Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

Add the ready label to the PR.
Enable auto-merge.

🚀

hmellor

Thanks for the more general fix!

DarkLight1337 · 2025-02-01T11:22:41Z

@youkaichao can you take a look at this? As per #12635 (comment), I think it's better to tell users to upgrade their pynvml version instead of silently casting the type.

hmellor · 2025-02-01T12:40:41Z

We could do both? We fix it because we can, and tell the user that they should upgrade pynvml to prevent the warning.
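A rough sketch of the "do both" idea, assuming the fix is a decode step and the nudge is a standard Python warning. The function name and the warning text are hypothetical and not taken from vLLM.

```python
import warnings
from typing import Union


def device_name_with_warning(name: Union[str, bytes]) -> str:
    """Decode a bytes device name and warn about it (illustrative only)."""
    if isinstance(name, bytes):
        # Hypothetical message; the wording in a real patch may differ.
        warnings.warn(
            "Device name was returned as bytes, which suggests an outdated "
            "pynvml; consider upgrading.",
            stacklevel=2,
        )
        return name.decode("utf-8")
    return name
```

With this shape, users on a current binding see no warning, while users on an outdated one still get a working `str` plus a hint to upgrade.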

DarkLight1337 · 2025-02-01T12:43:53Z

I'm not sure whether there are other implications besides just the type issue when outdated pynvml is used. @youkaichao should know more.

youkaichao · 2025-02-01T16:02:17Z

The outdated pynvml package generally does not work, and I think we can be more aggressive and turn

vllm/vllm/platforms/cuda.py

Line 31 in 4f4d427

if pynvml.__file__.endswith("__init__.py"):

into an error.
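As a sketch of what turning that check into an error could look like: the snippet below reuses the quoted `endswith("__init__.py")` test but takes the module path as an argument so it runs without pynvml installed. It assumes, as the quoted check implies, that the outdated distribution installs `pynvml` as a package directory. The function name and error text are hypothetical.

```python
def check_pynvml_module_file(module_file: str) -> None:
    """Raise if the pynvml module path looks like the outdated package.

    Hypothetical helper; in practice module_file would be pynvml.__file__.
    """
    if module_file.endswith("__init__.py"):
        # Assumption: a package-style layout indicates the outdated pynvml.
        raise RuntimeError(
            "An outdated pynvml package appears to be installed; "
            "please install nvidia-ml-py instead."
        )
```

Failing fast here trades convenience for clarity: users with the outdated package get one explicit error instead of scattered type issues later.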

youkaichao · 2025-02-01T16:04:09Z

A better solution is to make sure we always use nvidia-ml-py rather than pynvml, if we can.

I don't know whether some advanced Python import technique could achieve that.
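One possible direction, sketched under assumptions: rather than changing the import itself, inspect installed distribution metadata with the standard library's `importlib.metadata`. The distribution names `nvidia-ml-py` and `pynvml` come from this thread; the helper names and the returned labels are hypothetical. This only reports what is installed and does not settle which module wins at import time when both are present.

```python
from importlib import metadata
from typing import Callable


def has_distribution(
    name: str, lookup: Callable[[str], str] = metadata.version
) -> bool:
    """Return True if a distribution with this name is installed."""
    try:
        lookup(name)
    except metadata.PackageNotFoundError:
        return False
    return True


def nvml_binding_status(
    lookup: Callable[[str], str] = metadata.version
) -> str:
    """Classify which NVML binding distributions are installed (sketch)."""
    has_new = has_distribution("nvidia-ml-py", lookup)
    has_old = has_distribution("pynvml", lookup)
    if has_new and has_old:
        return "both"
    if has_new:
        return "nvidia-ml-py"
    if has_old:
        return "pynvml"
    return "neither"
```

The `lookup` parameter exists only so the logic can be exercised without installing either package.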

Fix get_device_name for cuda platforms that return bytes

489fc16

Signed-off-by: mgoin <[email protected]>

hmellor approved these changes Feb 1, 2025


This was referenced Feb 1, 2025

[Bugfix] Fix the device string for MoE models. #12565

Closed

Fix device return is bytecode instead of str #12635

Closed
