
Use of LiteLLM Router for load balancing and fallbacks #1570

Open
denisergashbaev opened this issue Oct 1, 2024 · 10 comments

denisergashbaev commented Oct 1, 2024

Since DSPy uses LiteLLM internally, I wonder how to use the LiteLLM Router. In particular, I would like to add load balancing and fallbacks via LiteLLM.

Another example: LiteLLM provides a rate-limit-aware routing strategy that routes each call to the deployment with the lowest tokens-per-minute (TPM) usage (see BerriAI/litellm#4510 and https://docs.litellm.ai/docs/routing#advanced---routing-strategies-%EF%B8%8F). I would want to use that router as well.
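For concreteness, a minimal sketch of the kind of client-side Router setup this would enable (deployment names, keys, and TPM values are placeholders):

```python
from litellm import Router

# Two deployments behind one alias, plus a cheaper fallback model.
model_list = [
    {
        "model_name": "gpt-4o",  # alias callers use
        "litellm_params": {
            "model": "azure/gpt-4o-eu",
            "api_key": "...",
            "api_base": "...",
            "tpm": 100_000,  # this deployment's tokens-per-minute budget
        },
    },
    {
        "model_name": "gpt-4o",
        "litellm_params": {
            "model": "azure/gpt-4o-us",
            "api_key": "...",
            "api_base": "...",
            "tpm": 50_000,
        },
    },
    {
        "model_name": "gpt-4o-mini",
        "litellm_params": {"model": "openai/gpt-4o-mini", "api_key": "..."},
    },
]

router = Router(
    model_list=model_list,
    routing_strategy="usage-based-routing",   # pick the deployment with the lowest TPM usage
    fallbacks=[{"gpt-4o": ["gpt-4o-mini"]}],  # try gpt-4o-mini if all gpt-4o deployments fail
)

response = router.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
```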

Thank you

okhat (Collaborator) commented Oct 1, 2024

Thanks! Maybe just launch their server and connect to it via the client dspy.LM? i.e., DSPy doesn't need to be involved.
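For example, a minimal sketch of that setup, assuming the LiteLLM proxy runs separately (e.g. `litellm --config config.yaml`, default port 4000) with the load balancing and fallbacks defined in its config; the model group name and key below are placeholders:

```python
import dspy

# DSPy talks to the proxy like any OpenAI-compatible endpoint; all routing
# happens server-side in the proxy.
lm = dspy.LM(
    "openai/gpt-4o",                   # model group defined in the proxy config
    api_base="http://localhost:4000",  # the proxy's address
    api_key="sk-...",                  # virtual key, if the proxy requires one
)
dspy.configure(lm=lm)
```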

denisergashbaev (Author) commented

Thank you, Omar! I think it makes sense to expose LiteLLM's Router as well. In some circumstances, one would not want to run a separate proxy server and would prefer using the LiteLLM Router for fallbacks, load balancing, and request prioritization.

okhat (Collaborator) commented Oct 2, 2024

@denisergashbaev Sorry, I'm mistaken about the nature of the LiteLLM router. I assumed it was inherently a proxy.

It's actually just a client-side thing, indeed: https://docs.litellm.ai/docs/routing

okhat (Collaborator) commented Oct 2, 2024

Yes, I think we should support this. It seems like we should inherit dspy.LM and just accept a list of models instead of one model. This seems cool. Do you need this soon? We'd certainly appreciate a PR.

zhaohan-dong (Contributor) commented Oct 8, 2024

Down to work on this if needed. @okhat, maybe another function in dspy.LM similar to litellm_completion? We could also add a router kwarg to pass in a litellm.Router object.

okhat (Collaborator) commented Oct 8, 2024

@zhaohan-dong Thanks a lot! How do you envision the interface looking? Let's agree on the right API before doing anything intensive :D

zhaohan-dong (Contributor) commented

Exactly what I hoped to ascertain. I think litellm tries to give Router a signature similar to the plain generation methods, so I'm thinking we could do dependency injection as a first step (see the sketch at the end of this comment):

  • An argument router: Optional[litellm.Router] = None
  • If router is not None, the inference call would invoke router.text_completion()
  • If model is not in the router's model_list, throw an error (not 100% sure this is the cleanest way)

Otherwise, we could subclass LM as AnotherClass(model_list=model_list, **kwargs), so people who want to use the router would opt into that class without affecting other users. Not sure what the best naming is or which file to put it in.

Happy to proceed either way or do something else you'd suggest.

Fundamentally, I see the plain litellm.text_completion() as a special case of Router.text_completion(), where there's only one model in the model_list.
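Concretely, a rough sketch of the injection option (a standalone illustration of the proposal, not actual DSPy internals; the model_list check is just one possible implementation):

```python
from typing import Optional

import litellm

class LM:
    """Sketch: dspy.LM accepting an optional litellm.Router."""

    def __init__(self, model: str, router: Optional[litellm.Router] = None, **kwargs):
        if router is not None:
            # Fail fast if the model alias isn't served by the router.
            known = {d["model_name"] for d in router.model_list}
            if model not in known:
                raise ValueError(f"{model!r} is not in the router's model_list")
        self.model = model
        self.router = router
        self.kwargs = kwargs

    def _completion(self, messages, **kwargs):
        # Dispatch through the router when present; otherwise call litellm directly,
        # i.e. the single-model case is just the degenerate router.
        if self.router is not None:
            return self.router.completion(model=self.model, messages=messages, **kwargs)
        return litellm.completion(model=self.model, messages=messages, **kwargs)
```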

okhat (Collaborator) commented Oct 10, 2024

Thanks a lot, @zhaohan-dong! I like the idea of a class that inherits from dspy.LM.

zhaohan-dong (Contributor) commented

Awesome! Maybe RoutedLM as the name?
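A hypothetical sketch of that interface from the user's side (RoutedLM is only the proposed name, not existing DSPy API; the actual completion dispatch would override dspy.LM's request path and is omitted here):

```python
import dspy
import litellm

class RoutedLM(dspy.LM):
    """Proposed: a dspy.LM that routes requests through litellm.Router."""

    def __init__(self, model: str, model_list: list[dict], **kwargs):
        super().__init__(model, **kwargs)
        self.router = litellm.Router(model_list=model_list)
        # ...completion calls would go through self.router instead of litellm directly.

# Proposed usage: behaves like dspy.LM, but balances across deployments.
model_list = [
    {"model_name": "gpt-4o", "litellm_params": {"model": "azure/gpt-4o-eu", "api_key": "..."}},
    {"model_name": "gpt-4o", "litellm_params": {"model": "azure/gpt-4o-us", "api_key": "..."}},
]
lm = RoutedLM("gpt-4o", model_list=model_list)
dspy.configure(lm=lm)
```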
