Optimized aggregation advisors for streaming scenarios

In the latest round of architectural changes to Advisors in #1422, there are now two types of advisors:
* `CallAroundAdvisor`
* `StreamAroundAdvisor`

With the non-streaming advisor, it's easy to act on the entire response. The streaming advisor, however, manipulates an entire stream, and the next advisor in the chain can manipulate that stream as well. If a stream advisor in the middle acts on each chunk of the response, all is fine. However, if the advisor is only interested in the complete aggregated response, it has to modify the stream so that everything is aggregated in a side channel, e.g. using the `org.springframework.ai.chat.model.MessageAggregator` class. If multiple advisors perform the same type of aggregation, this is inefficient in terms of both time and memory.
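To illustrate the cost, here is a minimal sketch that uses only the JDK. It deliberately does not use the real Spring AI or Reactor types: `AggregatingAdvisor`, `totalBufferedChunks`, and the plain `Stream<String>` standing in for the chunk stream are all hypothetical names chosen for this example. Each advisor buffers every chunk into its own side channel, so the buffering work grows with the number of advisors.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Stream;

public class DuplicateAggregationDemo {

    /** Hypothetical stand-in for a stream advisor that only cares about the full response. */
    static final class AggregatingAdvisor implements UnaryOperator<Stream<String>> {
        final StringBuilder buffer = new StringBuilder(); // private side channel
        int chunksBuffered = 0;

        @Override
        public Stream<String> apply(Stream<String> chunks) {
            // Pass every chunk through unchanged, but copy it into the buffer on the way.
            return chunks.peek(chunk -> {
                buffer.append(chunk);
                chunksBuffered++;
            });
        }
    }

    /** Chains advisorCount aggregating advisors and returns the total number of buffered chunks. */
    static int totalBufferedChunks(List<String> chunks, int advisorCount) {
        List<AggregatingAdvisor> advisors = new ArrayList<>();
        Stream<String> stream = chunks.stream();
        for (int i = 0; i < advisorCount; i++) {
            AggregatingAdvisor advisor = new AggregatingAdvisor();
            advisors.add(advisor);
            stream = advisor.apply(stream);
        }
        stream.forEach(chunk -> { }); // the application consumes the stream
        return advisors.stream().mapToInt(a -> a.chunksBuffered).sum();
    }

    public static void main(String[] args) {
        // 3 chunks buffered by each of 3 advisors
        System.out.println(totalBufferedChunks(List.of("Hel", "lo", "!"), 3)); // prints 9
    }
}
```

With three chunks and three advisors, nine buffer writes happen and three identical copies of the response are held in memory, where one would do.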

Given that, I propose a new interface, `StreamAggregationAdvisor`. Instances of this type would be fed an aggregation of the original stream of chunks coming back from the model on its way to the application, before any other advisor has a chance to manipulate the stream. The aggregation would then be performed only once and would work on the unaltered view of the exchange. This behaviour could be implemented using the innermost `StreamAroundAdvisor` that is created in the `DefaultChatClient`.
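A rough sketch of the idea, again JDK-only and under stated assumptions: only the interface name `StreamAggregationAdvisor` comes from this proposal. The method `onAggregatedResponse`, the push-style `stream` helper standing in for the innermost stream advisor, and the `outerAdvisor` function are hypothetical, and the real implementation would work on reactive streams rather than lists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.UnaryOperator;

public class SingleAggregationSketch {

    /** Proposed interface name; the method name and signature are assumptions. */
    interface StreamAggregationAdvisor {
        void onAggregatedResponse(String aggregated);
    }

    /**
     * Hypothetical stand-in for the innermost stream advisor: it buffers the raw
     * model chunks once, forwards each chunk through the outer advisor to the
     * application, and hands the single aggregate to every aggregation advisor.
     */
    static void stream(List<String> modelChunks,
                       List<StreamAggregationAdvisor> aggregationAdvisors,
                       UnaryOperator<String> outerAdvisor,
                       Consumer<String> application) {
        StringBuilder buffer = new StringBuilder();
        for (String chunk : modelChunks) {
            buffer.append(chunk);                         // aggregate the unaltered chunk
            application.accept(outerAdvisor.apply(chunk)); // outer advisors may still alter it
        }
        String aggregated = buffer.toString();
        for (StreamAggregationAdvisor advisor : aggregationAdvisors) {
            advisor.onAggregatedResponse(aggregated);     // same aggregate, computed once
        }
    }

    /** Runs a small scenario and returns the observed events in order. */
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        StreamAggregationAdvisor audit = text -> events.add("audit:" + text);
        StreamAggregationAdvisor metrics = text -> events.add("metrics:" + text);
        stream(List.of("Hel", "lo"),
               List.of(audit, metrics),
               String::toUpperCase,                       // an outer advisor that alters chunks
               chunk -> events.add("app:" + chunk));
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the application receives the altered chunks (`HEL`, `LO`), while both aggregation advisors receive the unaltered aggregate `Hello`, which is built exactly once.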

Optimized aggregation advisors for streaming scenarios #1439

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Optimized aggregation advisors for streaming scenarios #1439

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions