Summary
LiteLLM has a `Router` class that encapsulates `completion` with rate-limit handling. We can look into using it, because it should allow us to define a `RetryPolicy`, ideally based on how long the provider asks us to wait (though in my reading, it doesn't support that yet). It does allow defining a fallback LLM in case one provider runs out of retries. (#1263)

Rate limit headers for OpenAI:
https://platform.openai.com/docs/guides/rate-limits/rate-limits-in-headers
Rate limit headers for Anthropic:
https://docs.anthropic.com/en/api/rate-limits#response-headers
Technical Design
Replace the direct call to `litellm.completion` with `Router.completion`.
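A configuration sketch of what that swap could look like. The model names are placeholders, and the exact `Router` keyword surface (retries, fallbacks) should be checked against the LiteLLM version we pin:

```python
from litellm import Router

# Sketch only: deployment names and models below are placeholders.
router = Router(
    model_list=[
        {"model_name": "primary", "litellm_params": {"model": "gpt-4o"}},
        {"model_name": "backup", "litellm_params": {"model": "claude-3-5-sonnet-20240620"}},
    ],
    # Route to "backup" once "primary" exhausts its retries.
    fallbacks=[{"primary": ["backup"]}],
    num_retries=3,
)

response = router.completion(
    model="primary",
    messages=[{"role": "user", "content": "ping"}],
)
```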
Alternatives to Consider
Continue to handle it ourselves. Different providers have different rate limits, so our options are:
Fallback LLM:
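For the do-it-ourselves option, the core loop is small: retry one provider with backoff, then move to the next. A hypothetical sketch (names and the `RateLimitError` stand-in are ours, not a library's):

```python
import time

class RateLimitError(Exception):
    """Stand-in for whatever rate-limit exception the client library raises."""

def complete_with_fallback(providers, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Try each provider callable in order, with exponential backoff per provider."""
    for call in providers:
        for attempt in range(max_retries):
            try:
                return call()
            except RateLimitError:
                sleep(base_delay * 2 ** attempt)  # back off, then retry this provider
    raise RateLimitError("all providers exhausted")
```

The `sleep` parameter is injected so tests can run without real delays; in production it could instead wait for the provider-reported reset time.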