
support yarn in turbomind backend #2519

Merged
merged 10 commits into from
Nov 4, 2024

Conversation

irexyc (Collaborator) commented Sep 26, 2024

lvhan028 (Collaborator) commented:

@irexyc, please check the PyTorch engine as well, and add support there if it does not have this feature yet.

lvhan028 requested review from lzhangzz and grimoire on Oct 7, 2024 13:59
lvhan028 added the enhancement (New feature or request) label on Oct 7, 2024
Comment on lines 122 to 125
auto freq = inv_freq_[i / 2];
// YaRN ramp: where this dimension falls between "no scaling" and "full scaling"
float alpha = ((idx + i) / 2 - yarn_ramp_min) / (yarn_ramp_max - yarn_ramp_min);
alpha = fmaxf(0.f, fminf(1.f, alpha));  // clamp to [0, 1]
// interpolate between the original frequency (alpha = 0) and freq / factor (alpha = 1)
inv_freq_[i / 2] = freq * (1 - alpha + alpha / factor);
Collaborator:

All these expensive divisions can be done in host code. See how the llama3 rope is implemented just a few lines above.
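For illustration, a minimal sketch of that idea: fold the ramp into inv_freq once ahead of time so the kernel only reads pre-scaled values. The function name and the way yarn_ramp_min, yarn_ramp_max and factor are passed in are assumptions for this sketch, not the merged turbomind code.

import numpy as np

# Hypothetical host-side precompute: apply the YaRN ramp to inv_freq once,
# so the device kernel avoids the per-element divisions shown above.
def yarn_scaled_inv_freq(rope_theta, rotary_dim, factor, yarn_ramp_min, yarn_ramp_max):
    inv_freq = 1.0 / rope_theta ** (np.arange(0, rotary_dim, 2) / rotary_dim)
    alpha = (np.arange(rotary_dim // 2) - yarn_ramp_min) / (yarn_ramp_max - yarn_ramp_min)
    alpha = np.clip(alpha, 0.0, 1.0)  # same clamp as the kernel code
    return inv_freq * ((1.0 - alpha) + alpha / factor)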

@@ -53,13 +53,16 @@ class ModelConfig:
class AttentionConfig:
    rotary_embedding: int = 128
    rope_theta: float = 10000.0
    attention_factor: float = None

irexyc (Collaborator, author):

It seems that partial_rotary_factor changes the rotary dimension for the default, dynamic NTK, and YaRN rope types.
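For context, a short self-contained sketch of how that convention works in transformers' rope utilities, which is presumably what is being matched here; the helper name and the example numbers are assumptions for illustration:

def get_rotary_dim(hidden_size, num_attention_heads, partial_rotary_factor=1.0):
    # partial_rotary_factor shrinks the dimension rope is applied to; the
    # default, dynamic-NTK and YaRN paths then all operate on this smaller dim.
    head_dim = hidden_size // num_attention_heads
    return int(head_dim * partial_rotary_factor)

# e.g. hidden_size=4096, 32 heads, partial_rotary_factor=0.5 -> rotary_dim=64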

@@ -236,6 +239,10 @@ def model_info(self):
            else llama3_scaling_type
        if scaling_type == 'dynamic':
            use_dynamic_ntk = 1
        attention_factor = model_arg['rope_scaling'].get(
            'attention_factor', None)
Collaborator:

https://github.com/huggingface/transformers/blob/f2c388e3f946862f657acc1e21b272ec946fc66c/src/transformers/modeling_rope_utils.py#L198

attention_factor = config.rope_scaling.get("attention_factor")
if attention_factor is None:
    attention_factor = 0.1 * math.log(factor) + 1.0
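
For example, with factor = 4.0 and no attention_factor in the config, this default works out to 0.1 * ln(4.0) + 1.0 ≈ 1.139.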

irexyc (Collaborator, author):

The previous code had this logic; it should be applied when converting the model as well.

lvhan028 merged commit e557f05 into InternLM:main on Nov 4, 2024
7 of 9 checks passed
AllentDan pushed a commit to AllentDan/lmdeploy that referenced this pull request Nov 13, 2024
* support yarn in turbomind backend

* update qwen2 model to support yarn rope in pytorch backend

* use mul

* refactor export rope params

* support partial_rotary_factor

* fix lint

* fix rope type

* Revert "support partial_rotary_factor"

This reverts commit cc4cce7.