Streaming with OpenAIChatCompletionClient raises an `empty_chunk` exception when usage is requested, fixed using `max_consecutive_empty_chunk_tolerance` #5078
Comments

- Could you submit a PR to update the API docs of the …
- @ekzhu both of those classes inherit their …
- @auto-d, let's update the API doc of …
- I'm not able to test the Azure behavior. PR'd changes to both classes as well as the user guide. Take what you will! :)
- @auto-d thanks! Let's put it under the BaseOpenAIChatCompletionClient's …
Original issue description:

The model client documentation suggests a fix for missing token counts in an OpenAIChatCompletionClient streaming response: request usage statistics through the stream options. A minimal sketch of that fix (the model name and message here are placeholders, not from the original report):
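```python
from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient

client = OpenAIChatCompletionClient(model="gpt-4o")  # placeholder model name
stream = client.create_stream(
    [UserMessage(content="Hello", source="user")],
    # Ask the server to append a final chunk carrying token usage.
    extra_create_args={"stream_options": {"include_usage": True}},
)
```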
However, the final message from the server carrying the requested usage information raises an 'empty chunk' exception, because that usage-only chunk contains no choices and trips the empty-chunk check in the streaming processor implementation (openai/_openai_client.py).
Code to reproduce in autogen 0.4.1; a minimal sketch, assuming a placeholder model and prompt:
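```python
import asyncio

from autogen_core.models import CreateResult, UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    client = OpenAIChatCompletionClient(model="gpt-4o")  # placeholder model name
    stream = client.create_stream(
        [UserMessage(content="Write a haiku about streams.", source="user")],
        # Request the final usage-only chunk, per the documented fix above.
        extra_create_args={"stream_options": {"include_usage": True}},
    )
    # The loop raises on the final usage chunk (it has no choices), so the
    # CreateResult with token counts is never yielded.
    async for item in stream:
        if isinstance(item, CreateResult):
            print(item.usage)
        else:
            print(item, end="")


asyncio.run(main())
```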
Workaround for me was setting `max_consecutive_empty_chunk_tolerance` to 2, which per the comments in the source was included to fix a problem with the Azure endpoint. Continuing the repro above, the only change is the extra keyword (assuming, as in autogen 0.4.x, that it is accepted by `create_stream`):
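```python
stream = client.create_stream(
    [UserMessage(content="Write a haiku about streams.", source="user")],
    extra_create_args={"stream_options": {"include_usage": True}},
    # Tolerate the choice-less usage chunk instead of raising.
    max_consecutive_empty_chunk_tolerance=2,
)
```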