
[Feature]: Batching in LiteLLM for models that do not have native batching support. #7194

Open
markoff-dev opened this issue Dec 12, 2024 · 0 comments
Labels: enhancement (New feature or request)

The Feature

I would like LiteLLM to provide automatic batching at the library level itself, so that users can submit a file of requests and have it executed in the same way as for models with native batching support.

A similar request was made here: #361 (comment)

Question:
Are you considering adding similar functionality at the LiteLLM level?
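For illustration, here is a rough sketch of how this could look from the caller's side. It reuses the call shape of LiteLLM's existing Batches support (`create_file` / `create_batch`); the fallback behavior described in the comments is the requested feature, not something LiteLLM does today, and the file name and provider value are placeholders.

```python
# Hypothetical sketch of the requested behavior, reusing the call shape of
# LiteLLM's existing Batches support. The emulated fallback described below
# does NOT exist today; it is what this issue asks for.
import litellm

# requests.jsonl: one chat-completion request per line, in the JSONL format
# already used for OpenAI-style batch input files.
batch_file = litellm.create_file(
    file=open("requests.jsonl", "rb"),
    purpose="batch",
    custom_llm_provider="openai",  # placeholder: any OpenAI-compatible provider
)

# For providers without /files and /batches endpoints, LiteLLM would (per this
# request) emulate the batch by replaying the file's entries as ordinary
# /chat/completions calls and assembling an output file of responses.
batch = litellm.create_batch(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
    custom_llm_provider="openai",
)
print(batch.id, batch.status)
```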

Motivation, pitch

Problem:
Many LiteLLM usage scenarios require multiple model queries to process large datasets. However, some models do not support native batching, which leads to:

  • Increased network overhead (many individual requests to */chat/completions).
  • Longer overall response time when the number of requests is large.

At the moment, LiteLLM supports batching for the Azure OpenAI, OpenAI, and Vertex AI providers, but this feature is not available for models without native batching support (for example, many OpenAI-compatible endpoints do not expose the /files and /batches methods).
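For reference, the workaround today is to fan the requests out manually, e.g. with asyncio and `litellm.acompletion`. This is a minimal sketch of roughly the logic the issue asks LiteLLM to absorb (plus file input/output handling, retries, and rate limiting); the model name and concurrency limit are placeholders.

```python
import asyncio

import litellm


async def emulated_batch(requests, model="gpt-4o-mini", max_concurrency=8):
    """Fan a list of chat requests out as individual /chat/completions calls."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(messages):
        async with semaphore:
            return await litellm.acompletion(model=model, messages=messages)

    return await asyncio.gather(*(one(messages) for messages in requests))


# Example usage with two trivial requests.
responses = asyncio.run(emulated_batch([
    [{"role": "user", "content": "Hello"}],
    [{"role": "user", "content": "Summarize batching in one sentence."}],
]))
```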

Are you a ML Ops Team?

No

Twitter / LinkedIn details

No response
