
Add fp8 support for llama model family on Navi4x #245

Merged
4 commits merged into main on Oct 25, 2024
Conversation

@qli88 commented Oct 25, 2024

[MISC] Add FP8 support for llama model family on Navi4x

Review comment on vllm/utils.py (resolved)
@gshtras (Collaborator) commented Oct 25, 2024

Great job!
Please fix linters and consider the proposed navi check change, and then it's GTG

       2. Change implementation of is_navi4x (from env variable to CUDA query)
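
The commit above replaces an environment-variable check with a runtime device query. Below is a minimal sketch of what such a query-based check could look like, assuming a ROCm build of PyTorch that exposes `gcnArchName` on the device properties and that Navi4x GPUs report a gfx12xx architecture string; the helper name `is_navi4x` follows the commit message, but the body here is illustrative, not the PR's actual implementation.

```python
# Illustrative sketch only -- not the PR's actual implementation.
# Assumes a ROCm build of PyTorch where device properties expose
# `gcnArchName`, and that Navi4x parts report a gfx12xx architecture.
from functools import lru_cache

import torch


@lru_cache(maxsize=None)
def is_navi4x() -> bool:
    """Detect Navi4x by querying the device instead of reading an env var."""
    if not torch.cuda.is_available():
        return False
    arch = getattr(torch.cuda.get_device_properties(0), "gcnArchName", "")
    # Navi4x (RDNA4) devices are assumed here to report gfx12xx arch names.
    return arch.startswith("gfx12")
```

Caching the result avoids repeated device queries on a hot path, and falling back to `False` keeps the check safe on builds or hosts without a visible GPU.
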
@qli88 requested a review from gshtras on October 25, 2024 19:47
Review comments on vllm/model_executor/models/llama.py and vllm/utils.py (resolved)
       2. Remove unnecessary detection of the Navi4x platform
@qli88 requested a review from gshtras on October 25, 2024 21:26
@qli88 merged commit 4bba092 into main on Oct 25, 2024
16 of 17 checks passed
@gshtras deleted the qiang-navi4x-fp8-llama branch on October 25, 2024 21:40