Fix improper streaming in Azure Client #796

Conversation

bruno-oliveira
Contributor

Ensure streaming works in "real-time"

fix: Ensure streaming chunks are emitted individually to the client

Problem:
The previous implementation combined multiple chunks into a single response, causing all data to be sent to the client at once instead of streaming each chunk individually. This behavior was due to the use of reduce and concatMapIterable, which aggregated the data before emitting it.

Solution:

  1. Removed the use of reduce and concatMapIterable, which were combining chunks into a single response.
  2. Updated the code to use flatMap on the window to ensure each item within the window is processed and emitted individually.
  3. Streamed ChatCompletions directly to maintain the streaming behavior, ensuring that each chunk is processed and passed downstream as soon as it is received.
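
The aggregate-then-emit versus per-item-emit difference described above can be illustrated with a dependency-free Java sketch (Reactor itself is omitted; the Chunk record and the emit consumer are hypothetical stand-ins for ChatCompletions chunks and the downstream subscriber):

```java
import java.util.List;
import java.util.function.Consumer;

public class StreamingSketch {
    record Chunk(String content) {}

    // Before: chunks are reduced into one response, so the client
    // sees a single emission only after the last chunk has arrived.
    static void emitAggregated(List<Chunk> chunks, Consumer<String> emit) {
        String combined = chunks.stream()
                .map(Chunk::content)
                .reduce("", String::concat);
        emit.accept(combined); // one big payload at the end
    }

    // After: each chunk is passed downstream as soon as it is received,
    // analogous to flatMap emitting every item inside the window.
    static void emitIndividually(List<Chunk> chunks, Consumer<String> emit) {
        for (Chunk chunk : chunks) {
            emit.accept(chunk.content()); // one emission per chunk
        }
    }

    public static void main(String[] args) {
        List<Chunk> chunks = List.of(new Chunk("Hel"), new Chunk("lo"), new Chunk("!"));
        System.out.print("aggregated emissions: ");
        emitAggregated(chunks, c -> System.out.print("[" + c + "]"));
        System.out.println();
        System.out.print("individual emissions: ");
        emitIndividually(chunks, c -> System.out.print("[" + c + "]"));
        System.out.println();
    }
}
```

With the aggregated variant the consumer fires once with the full text; with the per-chunk variant it fires once per chunk, which is what a streaming client needs in order to render tokens as they arrive.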

Changes:

  • Removed the reduce call to avoid combining chunks.
  • Replaced concatMapIterable with flatMap to process each chunk individually.
  • Modified the windowUntil logic to correctly handle function calls and stream each chunk separately.

This change ensures that each chunk of data is streamed to the client individually, providing a smoother and more immediate streaming experience.

Tested:

  • Verified that the client receives each chunk individually as it is streamed.
  • Confirmed that the function call handling logic works correctly with the updated streaming approach.

@bruno-oliveira
Contributor Author

@markpollack @joshlong Hello, the streaming issue was not addressed in the previous fix, as the chunks were still arriving all at once. This PR fixes that.

@bruno-oliveira changed the title from "Update AzureOpenAiChatModel.java" to "Fix improper streaming in Azure Client" on May 30, 2024
@bruno-oliveira
Contributor Author

@joshlong @markpollack I think the issue lies in the Azure SDK client itself: a piece of its logic turns the streaming of chunks into a blocking call, which essentially makes streaming with Azure OpenAI and Spring AI impossible at the moment. Do you think this could get higher priority?

@tzolov
Contributor

tzolov commented Jun 18, 2024

@bruno-oliveira I cannot confirm your observation. In fact, there are integration tests that confirm the streamed response is indeed chunked for both function and non-function calls:

Please review the above tests and let me know if there is an issue with them.
If you still think there is an issue please write a test that illustrates it.

@bruno-oliveira
Contributor Author

@tzolov Thanks for reaching out! It's a pleasure to be able to engage with the community that builds such amazing tools that tons of devs use every day!

I have created a ticket, not here, but in the Azure SDK repo: Azure/azure-sdk-for-java#40629 (comment)

Note that several people confirmed this is an issue and observe this exact behavior. If I make a "streaming call" in Postman, the response "blocks" until all the chunks are "internally streamed", which makes it exactly the same as a normal, non-streaming call.
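
One way to tell buffered delivery apart from real streaming, independent of Postman, is to timestamp each chunk on arrival: with real streaming the inter-arrival gaps roughly follow the producer's pacing, while a buffered response delivers everything in one burst at the end. A self-contained simulation (no Azure SDK involved; the piped stream is a stand-in for an HTTP response body):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class StreamTimingCheck {
    public static void main(String[] args) throws Exception {
        PipedOutputStream producerOut = new PipedOutputStream();
        PipedInputStream consumerIn = new PipedInputStream(producerOut);

        // Producer: writes three chunks, 100 ms apart, then closes the pipe.
        Thread producer = new Thread(() -> {
            try (PrintWriter out = new PrintWriter(
                    new OutputStreamWriter(producerOut, StandardCharsets.UTF_8), true)) {
                for (String chunk : new String[]{"Hel", "lo", "!"}) {
                    out.println(chunk);
                    Thread.sleep(100);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // Consumer: records how long after the start each chunk arrives.
        List<Long> arrivals = new ArrayList<>();
        long start = System.nanoTime();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(consumerIn, StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                arrivals.add(elapsedMs);
                System.out.println("chunk \"" + line + "\" after " + elapsedMs + " ms");
            }
        }
        producer.join();

        // Real streaming: later chunks arrive measurably after the first one.
        long spread = arrivals.get(arrivals.size() - 1) - arrivals.get(0);
        System.out.println(spread >= 100
                ? "chunks arrived over time"
                : "chunks arrived in one burst");
    }
}
```

The same arrival-time logging applied to the real Azure OpenAI response body would show all chunks landing in a single burst if the SDK is blocking internally, as described above.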

Initially I started by investigating the Spring AI library, but that led me to think the bug must be in the Azure SDK after all.

Maybe this one can be closed, as it's not directly related to Spring AI. However, for all intents and purposes, streaming "doesn't work" at the moment with the Azure OpenAI client.

@timostark

@bruno-oliveira can you check #1054 if that solves the issue for you?

@asaikali asaikali added the azure label Nov 11, 2024
@markpollack
Member

Closing this issue; please reopen if something is still not working as expected.

@markpollack markpollack closed this May 6, 2025