
Fix streaming hieroglyphs #1492

Conversation

@pavel-esir (Contributor) commented Jan 7, 2025

  • When only the last n characters are held back before printing, multi-byte UTF-8 sequences can be cut in the middle; the resulting invalid UTF-8 byte sequences are corrupted to �.
  • Hold back the last n tokens instead of the last n characters. This fixes the issue.
visual_language_chat.py ./tiny-random-minicpmv-2_6 ./images <<< $'Describe the images?'

Before the fix

��������������������룅 encouraging룅 encouraging룅 encouraging룅 encouraging룅 encouraging룅 encouraging룅 encouraging

After the fix

룅튜룅튜룅튜룅튜룅튜룅 encouraging룅 encouraging룅 encouraging룅 encouraging룅 encouraging룅 encouraging

CVS-159227
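To make the characters-versus-tokens distinction concrete, here is a minimal sketch of the hold-back idea. It is not the actual openvino.genai TextStreamer code; the `detokenize` callable and the hold-back size are assumptions for illustration only.

def stream_by_tokens(token_ids, detokenize, hold_back=3):
    """Yield text chunks without splitting multi-byte UTF-8 characters.

    Holding back the last `hold_back` characters can cut a code point in
    half and produce U+FFFD; holding back the last `hold_back` tokens and
    skipping any text that still ends in U+FFFD keeps every emitted chunk
    valid UTF-8.
    """
    cache = []        # all token ids received so far
    printed = 0       # number of characters already emitted

    for token_id in token_ids:
        cache.append(token_id)
        if len(cache) <= hold_back:
            continue
        # Text of all but the last few tokens is stable and safe to print.
        stable = detokenize(cache[:-hold_back])
        if len(stable) > printed and not stable.endswith("\ufffd"):
            yield stable[printed:]
            printed = len(stable)

    # Flush whatever remains once generation is finished.
    final = detokenize(cache)
    if len(final) > printed:
        yield final[printed:]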

@pavel-esir added the bug label Jan 7, 2025
@pavel-esir added this to the 2024.6 milestone Jan 7, 2025
@ilya-lavrenov (Contributor) commented Jan 7, 2025

Is it possible to add tests for the streamer?
E.g., without a model, feed some tokens and check how they are processed via the callback (i.e., that we don't receive broken symbols).

@github-actions bot added the category: visual language, category: LLM, and category: samples labels Jan 13, 2025
@pavel-esir (Contributor Author) replied
Is it possible to add tests for the streamer? E.g., without a model, feed some tokens and check how they are processed via the callback (i.e., that we don't receive broken symbols).

Unfortunately, it turned out to be quite difficult to call Python from C++ and to make sure the environment has the packages needed to convert a model with optimum-cli to openvino_detokenizer.xml.

I upgraded the existing streamer tests and ensured that the streamed results are the same as the output of generate.
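The shape of that check is roughly the following sketch; the model path, prompt, and max_new_tokens value are placeholders rather than the exact test from this PR (see the diff below for the real assertion).

import openvino_genai as ov_genai

# Placeholder path and prompt for illustration only.
pipe = ov_genai.LLMPipeline("path/to/exported/model", "CPU")
config = ov_genai.GenerationConfig(max_new_tokens=20)

result_from_streamer = []
def streamer(subword):
    # Each streamed chunk should already be valid UTF-8 text (no broken symbols).
    result_from_streamer.append(subword)

res = pipe.generate("What is OpenVINO?", generation_config=config, streamer=streamer)

# The concatenation of streamed chunks must equal the final generate() output;
# any � produced by a split code point would break this equality.
assert res.texts[0] == ''.join(result_from_streamer)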

@pavel-esir requested a review from Wovchena January 13, 2025 08:54
@@ -361,25 +361,47 @@ def test_callback_batch_fail(callback):
pipe.generate(['1', '2'], ov_genai.GenerationConfig(), callback)


@pytest.mark.parametrize("callback", [print, user_defined_callback, lambda subword: print(subword)])
class StremerWithResults:
@ilya-lavrenov (Contributor) commented Jan 13, 2025

I think we need to move this to

def run_llm_pipeline(
    models_path : Path,
    prompts: List[str],
    generation_config : GenerationConfig,
    use_cb : bool = False
) -> List[GenerationResult]:
    properties = get_default_properties()
    if use_cb:
        properties['scheduler_config'] = SchedulerConfig()
    ov_pipe = LLMPipeline(models_path, device='CPU', **properties)
    generate_outputs : DecodedResults = ov_pipe.generate(inputs=prompts, generation_config=generation_config)
    index = 0
    generation_results = []
    for _ in prompts:
        generation_result = GenerationResult()
        generation_result.m_generation_ids = generate_outputs.texts[index : index + generation_config.num_return_sequences]
        # sequences_scores are available only for beam search case
        if generation_config.is_beam_search():
            generation_result.m_scores = generate_outputs.scores[index : index + generation_config.num_return_sequences]
        generation_results.append(generation_result)
        index += generation_config.num_return_sequences
    del ov_pipe
    shutil.rmtree(models_path)
    return generation_results

because it covers significantly more cases (but limit the streamer check to cases with a single batch and num_return_sequences equal to 1)

@pavel-esir (Contributor Author) replied

I agree, but that helper is present only in master, and this PR targets the release branch.
I will do that when cherry-picking to master.
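For reference, one possible shape of that follow-up in master, sketched under the assumption that the streamer check is attached only for the single-prompt, num_return_sequences == 1 case; the wrapper name is hypothetical, while LLMPipeline and get_default_properties are taken from the quoted helper above.

def run_llm_pipeline_with_streamer_check(models_path, prompts, generation_config):
    """Hypothetical wrapper: run the pipeline and, where well defined,
    check that the streamed chunks reproduce the generate() output."""
    use_streamer = len(prompts) == 1 and generation_config.num_return_sequences == 1
    streamed = []

    ov_pipe = LLMPipeline(models_path, device='CPU', **get_default_properties())
    kwargs = {}
    if use_streamer:
        # Collect every streamed chunk for comparison with the final text.
        kwargs['streamer'] = lambda subword: streamed.append(subword)
    outputs = ov_pipe.generate(inputs=prompts, generation_config=generation_config, **kwargs)

    if use_streamer:
        # Broken UTF-8 symbols in the stream would make the strings differ.
        assert outputs.texts[0] == ''.join(streamed)
    return outputs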

pipe.generate(prompt, generation_config=get_greedy(), streamer=streamer)
result_from_streamer = []
res = pipe.generate(prompt, generation_config=get_greedy(), streamer=streamer)
assert res.texts[0] == ''.join(result_from_streamer)
A reviewer (Contributor) commented
I think we don't need to duplicate it in the VLM pipeline, because internally both VLM and LLM use the same TextStreamer.

@pavel-esir (Contributor Author) replied
This VLM model is a good example of where the streamer was broken in the first place, but since we were not comparing here, we didn't catch it from the beginning. This assert comparison costs us almost nothing, yet it can prevent mistakes.

…10-llamafied_ov and all existing UTF8 problems
@andrei-kochin modified the milestones: 2024.6, 2025.0 Jan 13, 2025
@pavel-esir added the port to master label Jan 13, 2025
@ilya-lavrenov added this pull request to the merge queue Jan 13, 2025
Merged via the queue into openvinotoolkit:releases/2024/6 with commit 9cf1601 Jan 13, 2025
51 of 52 checks passed
Labels
  • bug - Something isn't working
  • category: LLM - LLM pipeline (stateful, static)
  • category: samples - GenAI samples
  • category: visual language - Visual language pipeline
  • no-match-files
  • port to master - PR needs to be ported to master from release branch