use CK FA for glm-4v on navi3 #281

jfactory07 · 2024-11-15T05:36:40Z

SDPA (scaled dot-product attention) is very slow on navi3. Replace it with CK FA (Composable Kernel FlashAttention) for glm-4v on navi3.

gshtras · 2024-11-15T15:32:13Z

vllm/model_executor/models/glm4_vision_encoder.py

@@ -79,6 +80,31 @@ def __init__(
        self.output_dropout = torch.nn.Dropout(config.dropout_prob)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        _ON_NAVI3 = "gfx11" in torch.cuda.get_device_properties("cuda").gcnArchName


Calling torch.cuda directly in vllm will not work on platforms other than ROCm. Please refer to platform/rocm.py and its superclass, and also to is_navi() in utils.py.

The CK FA implementation is specific to navi3, but is_navi() includes all Navi GPUs. How about:
_ON_NAVI3 = current_platform.is_rocm() and "gfx11" in torch.cuda.get_device_properties("cuda").gcnArchName

Checking is_rocm first is the right way to go. Just please move this alongside the general is_navi in utils.py, for visibility and so that others can use it as well.

Thanks for the suggestion. I have added an is_navi3 function in utils.py.
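
For illustration, a minimal sketch of what such a helper could look like. This is not the code merged in this PR: `arch_is_navi3` is a hypothetical name, and the `torch.version.hip` check is an assumed stand-in for vllm's `current_platform.is_rocm()`, used only so the snippet runs without vllm (or even torch) installed.

```python
from functools import lru_cache


def arch_is_navi3(gcn_arch_name: str) -> bool:
    """Return True for gfx11xx architecture strings such as 'gfx1100'."""
    return "gfx11" in gcn_arch_name


@lru_cache(maxsize=None)
def is_navi3() -> bool:
    """Best-effort Navi3 check that is safe to call on non-ROCm platforms."""
    try:
        import torch
    except ImportError:
        return False
    # Assumption: torch.version.hip is None on non-ROCm builds of PyTorch.
    if getattr(torch.version, "hip", None) is None:
        return False
    if not torch.cuda.is_available():
        return False
    props = torch.cuda.get_device_properties(0)
    # gcnArchName is only meaningful on ROCm; default to "" if it is absent.
    return arch_is_navi3(getattr(props, "gcnArchName", ""))
```

Caching the result means the device query runs once rather than on every forward() call, and checking the platform before touching the device properties keeps the helper from raising on non-ROCm platforms.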

gshtras · 2024-11-15T15:32:27Z

vllm/model_executor/models/glm4_vision_encoder.py

+        _ON_NAVI3 = "gfx11" in torch.cuda.get_device_properties("cuda").gcnArchName
+        if _ON_NAVI3:
+            try:
+                # git clone -b howiejay/navi_support https://github.com/ROCm/flash-attention.git


Will this fail without flash_attn built from this particular branch?

Yes, only this branch supports navi3.

Will any other FA branch trigger a ModuleNotFoundError?
Is there an ETA on when this branch is going to be accepted into mainline FA?

The other FA branches don't support navi3; they will fail to install on gfx1100. I don't think there is a plan to upstream this branch.
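
As a rough sketch of the guarded import discussed in this thread, the following shows one way to fall back gracefully when flash_attn is not installed. `load_optional` and `USE_CK_FA` are hypothetical names for illustration, not identifiers from this PR, and the sketch only covers the case where the package is missing entirely; how a build from a different branch behaves at import time is not established here.

```python
import importlib
from typing import Any, Optional


def load_optional(module_name: str, attr: str) -> Optional[Any]:
    """Return module_name.attr, or None if the module or attribute is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return None
    return getattr(module, attr, None)


# Assumption: the navi3 branch exposes flash_attn_func like the mainline package.
flash_attn_func = load_optional("flash_attn", "flash_attn_func")

# When this is False, the caller would stay on the existing SDPA path.
USE_CK_FA = flash_attn_func is not None
```

Resolving the import once at module load, rather than inside forward(), keeps the try/except out of the hot path.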

use CK FA

ae62c82

jfactory07 requested review from shajrawi, gshtras and maleksan85 November 15, 2024 05:37

gshtras reviewed Nov 15, 2024


jzhou added 3 commits November 19, 2024 10:41

refine

1b0e54b

fix error

47de136

format

ff58fa1

gshtras approved these changes Nov 20, 2024


Merge branch 'develop' into glm-4v-ck-fa

4453d1b

gshtras merged commit 8647e89 into ROCm:develop Nov 20, 2024
7 checks passed

gshtras mentioned this pull request Dec 16, 2024

Enable CK Attention for Navi31 #285

Open
