
support eos_token list in turbomind #3044

Open · wants to merge 9 commits into main
Conversation

@irexyc (Collaborator) commented Jan 16, 2025

Motivation

MinLengthLogitsProcessor in transformers supports a list of eos_token_id values; this PR adds the same support to turbomind's min-length penalty.
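For reference, the behaviour being matched (mask every eos id until the sequence reaches the minimum length) can be sketched in plain NumPy. The function name and array shapes here are illustrative, not turbomind's actual API:

```python
import numpy as np

def apply_min_length_penalty(logits, cur_len, min_length, eos_token_ids):
    """Forbid every eos token while the generated sequence is still too short.

    Mirrors what transformers' MinLengthLogitsProcessor does when given a
    list of eos_token_id values; pure-NumPy sketch, not the CUDA kernel.
    """
    if cur_len < min_length:
        for eos in eos_token_ids:
            logits[:, eos] = -np.inf  # eos cannot be sampled yet
    return logits

# batch of 2 requests, vocab of 5, two possible eos ids
logits = np.zeros((2, 5), dtype=np.float32)
out = apply_min_length_penalty(logits, cur_len=3, min_length=8,
                               eos_token_ids=[2, 4])
# columns 2 and 4 are now -inf for both rows; all other entries are untouched
```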

Modification

diff --git a/benchmark/profile_throughput.py b/benchmark/profile_throughput.py
index 2e4d2a3b..9fc9605f 100644
--- a/benchmark/profile_throughput.py
+++ b/benchmark/profile_throughput.py
@@ -108,6 +108,7 @@ class Engine:
                 session_id,
                 input_ids=input_ids,
                 gen_config=GenerationConfig(max_new_tokens=output_seqlen,
+                                            min_new_tokens=output_seqlen - 1,
                                             temperature=temperature,
                                             top_p=top_p,
                                             top_k=top_k,
Profiled with nsys:

FT_NVTX=ON /mnt/141/2024.5.1/target-linux-x64/nsys profile -t cuda,nvtx,osrt,cudnn,cublas -o output -f true --stats true  python ../benchmark/profile_throughput.py -n 20000 /home/chenxin/ShareGPT_V3_unfiltered_cleaned_split.json /home/chenxin/Llama-3.2-1B-Instruct


Before (single eos_token_id):

 Time (%)  Total Time (ns)  Instances  Avg (ns)  Med (ns)  Min (ns)  Max (ns)  StdDev (ns)                                                  Name                                                
 --------  ---------------  ---------  --------  --------  --------  --------  -----------  ----------------------------------------------------------------------------------------------------
      0.0         45986089      16761    2743.6    2656.0      2272      3840        276.0  void turbomind::batchApplyMinLengthPenalty<float>(T1 *, const int *, const int *, const int *, int,…

After (eos list of size sz):

(sz=1)0.0         48248303      16797    2872.4    2816.0      2464      3808        237.3  void turbomind::batchApplyMinLengthPenalty<float>(T1 *, int, const int *, const int *, int, const i…
(sz=2)0.0         45838054      16854    2719.7    2656.0      2240      3552        170.6  void turbomind::batchApplyMinLengthPenalty<float>(T1 *, int, const int *, const int *, int, const i…
(sz=4)0.0         45971238      16826    2732.2    2688.0      2304      3584        167.5  void turbomind::batchApplyMinLengthPenalty<float>(T1 *, int, const int *, const int *, int, const i…
(sz=8)0.0         45888590      16819    2728.4    2656.0      2240      3680        171.4  void turbomind::batchApplyMinLengthPenalty<float>(T1 *, int, const int *, const int *, int, const i…
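The batched kernel's semantics, one min length and one end-id list per slot with the list padded to a fixed width sz, can be sketched in Python. The padding-with-(-1) convention and the argument layout are assumptions for illustration, not the kernel's real signature:

```python
import numpy as np

NEG_INF = -float("inf")

def batch_apply_min_length_penalty(logits, end_ids, min_lengths, seq_lens):
    """Per-request min-length penalty over a fixed-width end-id table.

    logits      : (batch, vocab) float array
    end_ids     : (batch, sz) int array, padded with -1 (assumed convention)
    min_lengths : (batch,) minimum generation length per request
    seq_lens    : (batch,) tokens generated so far per request
    """
    for b in range(logits.shape[0]):
        if seq_lens[b] < min_lengths[b]:
            for eos in end_ids[b]:
                if eos >= 0:  # skip padding slots
                    logits[b, eos] = NEG_INF
    return logits

logits = np.zeros((2, 6), dtype=np.float32)
end_ids = np.array([[5, -1], [3, 4]])  # sz = 2
out = batch_apply_min_length_penalty(logits, end_ids,
                                     min_lengths=[4, 4], seq_lens=[2, 10])
# slot 0 is still short, so token 5 is masked; slot 1 is long enough, untouched
```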

@lvhan028 lvhan028 requested review from lvhan028 and lzhangzz January 20, 2025 03:40
@lvhan028 (Collaborator):

Overall LGTM
