Summary
LiteLLM has a `Router` class that encapsulates `completion` with rate-limit handling. We can look into using it, because it should allow us to define a `RetryPolicy`, ideally based on how long the provider asks us to wait (though in my reading, it doesn't support that yet). It does allow defining a fallback LLM in case one provider runs out of retries. (#1263)

Rate limit headers for OpenAI:
https://platform.openai.com/docs/guides/rate-limits/rate-limits-in-headers
Rate limit headers for Anthropic:
https://docs.anthropic.com/en/api/rate-limits#response-headers
Technical Design
Replace the direct call to `litellm.completion` with `Router.completion`.
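A configuration sketch of what that swap could look like. The model names are placeholders, and the exact `Router` keyword surface (retries, fallbacks) should be checked against the LiteLLM version we pin:

```python
from litellm import Router

# Sketch only: deployment names and models below are placeholders.
router = Router(
    model_list=[
        {"model_name": "primary", "litellm_params": {"model": "gpt-4o"}},
        {"model_name": "backup", "litellm_params": {"model": "claude-3-5-sonnet-20240620"}},
    ],
    # Route to "backup" once "primary" exhausts its retries.
    fallbacks=[{"primary": ["backup"]}],
    num_retries=3,
)

response = router.completion(
    model="primary",
    messages=[{"role": "user", "content": "ping"}],
)
```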
Alternatives to Consider
Continue to handle it ourselves. Different providers have different rate limits, so our options are:
Fallback LLM:
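For the do-it-ourselves option, the core loop is small: retry one provider with backoff, then move to the next. A hypothetical sketch (names and the `RateLimitError` stand-in are ours, not a library's):

```python
import time

class RateLimitError(Exception):
    """Stand-in for whatever rate-limit exception the client library raises."""

def complete_with_fallback(providers, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Try each provider callable in order, with exponential backoff per provider."""
    for call in providers:
        for attempt in range(max_retries):
            try:
                return call()
            except RateLimitError:
                sleep(base_delay * 2 ** attempt)  # back off, then retry this provider
    raise RateLimitError("all providers exhausted")
```

The `sleep` parameter is injected so tests can run without real delays; in production it could instead wait for the provider-reported reset time.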