-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: include_usage
for streaming doesn't work due to an SDK client bug
#266
Comments
Thank you for reaching out, @janaka ! |
The problem is not with the inference API from what I can tell though. It's a dotnet library issue I'm seeing. The request going out does not have include_usage = true |
@joseharriaga assuming by In any case I'm not expecting the REST API version changing to adjust the client lib behaviour. I don't see where this option is switch off in the client (is it?). So I expect the http request going out to have |
@janaka Your understanding is correct! Because |
@joseharriaga that's awesome! will that be in 2.1.0-beta.2 ? will it be soon like in the next couple weeks, this quarter, or next quarter, or beyond next quarter? |
@janaka Yes, we're aiming to include it in 2.1.0-beta.2, which we expect to release between this week and next week. |
@janaka This is now supported in version 2.1.0-beta.2, which we released yesterday. |
Service
Azure OpenAI
Describe the bug
This feature seems to be switched on to true by default in the OpenAI client. However, the StreamingChatCompletionUpdate.Usage property is always null.
Before CompleteChatStreamingAsync(messages, options) is called when I inspect the options object, Stream=null and StreamOptions.IncludeUsage=true. However, immediately after CompleteChatStreamingAsync(messages, options) is called when I inspect the options object, Stream=true and SteamOptions is null.
Inspecting the sdk source code, this appears to be a bug in how the internal constructor code is implemented.
Azure client version: 2.1.0 Beta 1
OpenAI Client version: implicit (>= 2.1.0-beta.1)
OpenAI Inference API version: AzureOpenAIClientOptions.ServiceVersion.V2024_08_01_Preview
OpenAI API provider : Azure OpenAI
Model name: GPT-4o 08-06
Just before calling CompleteChatStreamingAsync:
Just after calling CompleteChatStreamingAsync:
Steps to reproduce
Use the v2.1.* dotnet client to make a streaming call this should return usage. But it doesn't.
Making a Postman request, with the correct options for include usage, to the same deployment does return token usage counts.
Code snippets
No response
OS
macOS
.NET version
netstandard 2.0
Library version
2.1.0-beta1
The text was updated successfully, but these errors were encountered: