Use whole history in case of undetermined tokenization of sequence #1254

sbalandi · 2024-11-26T12:20:37Z

Task: CVS-157295

The first commit is a cherry-pick of Use whole history in case of undetermined tokenization of sequence #1268 and Fix trim kv cache for chat vlm #1361.
The next commit applies the review comments from Use whole history in case of undetermined tokenization of sequence #1268 and adds use of the KV cache for LLM.

src/cpp/src/llm_pipeline.cpp

src/cpp/src/utils.cpp

src/cpp/src/llm_pipeline.cpp

src/cpp/src/visual_language/inputs_embedder.cpp

src/cpp/src/visual_language/pipeline.cpp

Wovchena · 2024-11-27T09:21:02Z

src/cpp/src/llm_pipeline.cpp

Not for this PR, but we need tests for that change. I remember you said random miniCPM diverges. This PR probably fixes that, so your case can be used as a test for VLM. LLM needs its test as well, maybe you can provoke a random weights model to generate such tokens.

src/cpp/src/llm_pipeline.cpp

sbalandi · 2024-11-27T18:35:11Z

comments have been addressed - #1268

ilya-lavrenov · 2024-12-03T13:13:44Z

Let's sync with #1268 and address remaining comments from that PR if they are valid

Wovchena · 2024-12-13T13:39:42Z

src/cpp/src/utils.cpp

+        ov::Tensor old_tensor = state.get_state();
+        // [BATCH_SIZE, num_kv_heads, seq_len, head_size]
+        auto shape = old_tensor.get_shape();
+        shape[2] -= remove_from_end;


Unfortunately, the sequence is sometimes represented by a different dimension. See

openvino.genai/samples/cpp/prompt_lookup_decoding_lm/prompt_lookup_decoding_lm.cpp

Line 116 in d189eb7

shape[seq_len_axis] = new_seq_len;

sbalandi · 2024-12-13

Thanks for pointing that out. I added a runtime search for seq_len_axis.
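For readers following the thread, here is a minimal sketch of the point above: trimming must happen along whichever axis holds the sequence length, not a hard-coded `shape[2]`. It uses plain `std::vector` buffers instead of `ov::Tensor`, and the helper names `trim_shape` and `trim_row_major` are hypothetical, not taken from this PR or from the OpenVINO API. How the axis itself is discovered at runtime is not shown here.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not from the PR): returns `shape` with
// `remove_from_end` positions dropped along `seq_len_axis`.
std::vector<std::size_t> trim_shape(std::vector<std::size_t> shape,
                                    std::size_t seq_len_axis,
                                    std::size_t remove_from_end) {
    if (seq_len_axis >= shape.size())
        throw std::out_of_range("seq_len_axis is outside the tensor rank");
    if (remove_from_end > shape[seq_len_axis])
        throw std::invalid_argument("cannot remove more positions than the cache holds");
    shape[seq_len_axis] -= remove_from_end;
    return shape;
}

// Hypothetical helper (not from the PR): trims a dense row-major buffer
// along `seq_len_axis`, keeping the first (seq_len - remove_from_end) positions.
std::vector<float> trim_row_major(const std::vector<float>& data,
                                  const std::vector<std::size_t>& shape,
                                  std::size_t seq_len_axis,
                                  std::size_t remove_from_end) {
    const std::vector<std::size_t> new_shape =
        trim_shape(shape, seq_len_axis, remove_from_end);

    // Collapse the dimensions before and after the sequence axis.
    std::size_t outer = 1, inner = 1;
    for (std::size_t i = 0; i < seq_len_axis; ++i) outer *= shape[i];
    for (std::size_t i = seq_len_axis + 1; i < shape.size(); ++i) inner *= shape[i];

    const std::size_t old_len = shape[seq_len_axis];
    const std::size_t new_len = new_shape[seq_len_axis];
    if (data.size() != outer * old_len * inner)
        throw std::invalid_argument("buffer size does not match shape");

    std::vector<float> out;
    out.reserve(outer * new_len * inner);
    for (std::size_t o = 0; o < outer; ++o) {
        // Each outer slice is contiguous; copy only its leading positions.
        const float* begin = data.data() + o * old_len * inner;
        out.insert(out.end(), begin, begin + new_len * inner);
    }
    return out;
}
```

With a `[BATCH_SIZE, num_kv_heads, seq_len, head_size]` layout the caller would pass `seq_len_axis = 2`; a model that stores the sequence in another dimension only needs a different axis value, with no change to the trimming logic.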

github-actions bot added the labels "category: visual language" (Visual language pipeline), "category: LLM" (LLM pipeline: stateful, static), and "no-match-files" Nov 26, 2024

sbalandi force-pushed the tok_hist branch 2 times, most recently from 49a4c9b to 05b3302 Compare November 26, 2024 13:52

sbalandi marked this pull request as ready for review November 26, 2024 13:52

sbalandi requested review from Wovchena and ilya-lavrenov November 26, 2024 13:52

ilya-lavrenov added this to the 2025.0 milestone Nov 26, 2024

ilya-lavrenov added the "port to LTS" label (PR needs to be ported to LTS) Nov 26, 2024

ilya-lavrenov self-assigned this Nov 26, 2024

Wovchena requested changes Nov 27, 2024

ilya-lavrenov reviewed Nov 27, 2024

ilya-lavrenov assigned Wovchena Nov 27, 2024

ilya-lavrenov modified the milestones: 2025.0, 2024.5.1 Nov 27, 2024

ilya-lavrenov removed the "port to LTS" label (PR needs to be ported to LTS) Dec 3, 2024

ilya-lavrenov modified the milestones: 2024.6, 2025.0 Dec 3, 2024

sbalandi force-pushed the tok_hist branch 3 times, most recently from ddd706c to 266c4b6 Compare December 12, 2024 12:13

sbalandi mentioned this pull request Dec 12, 2024

fill prompt for sampler analysis with real tokens in VLM pipeline #1247

Merged

ilya-lavrenov requested a review from Wovchena December 13, 2024 08:28

ilya-lavrenov approved these changes Dec 13, 2024

ilya-lavrenov mentioned this pull request Dec 13, 2024

Use whole history in case of undetermined tokenization of sequence #1268

Merged

Wovchena requested changes Dec 13, 2024

sbalandi force-pushed the tok_hist branch from 266c4b6 to c7ba420 Compare December 13, 2024 18:10

sbalandi force-pushed the tok_hist branch 2 times, most recently from c12aee5 to 05305e3 Compare December 15, 2024 12:29

sbalandi added 3 commits December 15, 2024 13:28

Use whole history in case of undetermined tokenization of sequence

352c344

Apply comments

ffbba27

add kv_cache_seq_length axis choice

05305e3

ilya-lavrenov requested a review from Wovchena December 16, 2024 07:25

Wovchena approved these changes Dec 16, 2024

Wovchena enabled auto-merge December 16, 2024 07:50

Wovchena added this pull request to the merge queue Dec 16, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 16, 2024

ilya-lavrenov added this pull request to the merge queue Dec 16, 2024

Merged via the queue into openvinotoolkit:master with commit 9e9b409 Dec 16, 2024
59 checks passed
