Streaming with OpenAIChatCompletionClient raises an `empty_chunk` exception when usage is requested, fixed using `max_consecutive_empty_chunk_tolerance` #5078
Comments

- Could you submit a PR to update the API docs of the …
- @ekzhu both of those classes inherit their …
- @auto-d, let's update the API doc of …
- I'm not able to test the Azure behavior. PR'd changes to both classes as well as the user guide. Take what you will! :)
- @auto-d thanks! Let's put it under the BaseOpenAIChatCompletionClient's …
Original issue description:

The model client documentation suggests a fix for missing token counts in an OpenAIChatCompletionClient streaming response: request usage statistics through the stream options. A minimal sketch of that fix (the model name and message here are placeholders, not from the original report):
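```python
from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient

client = OpenAIChatCompletionClient(model="gpt-4o")  # placeholder model name
stream = client.create_stream(
    [UserMessage(content="Hello", source="user")],
    # Ask the server to append a final chunk carrying token usage.
    extra_create_args={"stream_options": {"include_usage": True}},
)
```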
However, the final message from the server carrying the requested usage information raises an 'empty chunk' exception, because that usage-only chunk contains no choices and trips the empty-chunk check in the streaming processor implementation (openai/_openai_client.py).
Code to reproduce in autogen 0.4.1; a minimal sketch, assuming a placeholder model and prompt:
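```python
import asyncio

from autogen_core.models import CreateResult, UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    client = OpenAIChatCompletionClient(model="gpt-4o")  # placeholder model name
    stream = client.create_stream(
        [UserMessage(content="Write a haiku about streams.", source="user")],
        # Request the final usage-only chunk, per the documented fix above.
        extra_create_args={"stream_options": {"include_usage": True}},
    )
    # The loop raises on the final usage chunk (it has no choices), so the
    # CreateResult with token counts is never yielded.
    async for item in stream:
        if isinstance(item, CreateResult):
            print(item.usage)
        else:
            print(item, end="")


asyncio.run(main())
```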
Workaround for me was setting `max_consecutive_empty_chunk_tolerance` to 2, which per the comments in the source was included to fix a problem with the Azure endpoint. Continuing the repro above, the only change is the extra keyword (assuming, as in autogen 0.4.x, that it is accepted by `create_stream`):
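```python
stream = client.create_stream(
    [UserMessage(content="Write a haiku about streams.", source="user")],
    extra_create_args={"stream_options": {"include_usage": True}},
    # Tolerate the choice-less usage chunk instead of raising.
    max_consecutive_empty_chunk_tolerance=2,
)
```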