[LLM] Support LLM routing through notdiamond #4184
Comments
OpenHands started fixing the issue! You can monitor the progress here.
@xingyaoww - also see #4109 where litellm's Router is being incorporated, along with a config structure that could maybe be used here
OpenHands started fixing the issue! You can monitor the progress here.
An attempt was made to automatically fix this issue, but it was unsuccessful. A branch named 'openhands-fix-issue-4184' has been created with the attempted changes. You can view the branch here. Manual intervention may be required.
Quick point of discussion: do we want to implement this within OpenHands? Or should we host a server with the router, like we host our proxy server for All Hands AI? Personally I think the latter might be better. Doing this on the client side means that users have to acquire several different API keys and somehow configure them. This seems like a pain UI-wise, especially given that currently our configuration behavior is hard to understand: #3220
Good point - but another thing is that it might be tricky to calculate costs for the router (especially with all the prompt caching and such) :(. Another potential idea is to do this with the LiteLLM router 🤔
Yeah, maybe NotDiamond could be implemented as a custom routing strategy within the LiteLLM proxy?
Yeah, that seems like a better approach (if we can get the cost propagation to work correctly). Closing this for now, then.
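To make the idea concrete, a routing strategy in this style boils down to a select-then-dispatch step in front of the usual completion call. The sketch below is pure Python with hypothetical names (`MODEL_SCORES`, `classify_query`, `select_model`); it is not the LiteLLM or NotDiamond API, just the shape of the pattern being discussed - a real router would score the query with a learned model rather than keyword heuristics.

```python
# Hypothetical select-then-dispatch sketch: pick a model for a query,
# then hand the chosen model name to the normal completion path
# (e.g. litellm.completion(model=selected, ...)).

MODEL_SCORES = {
    "gpt-4o": {"code": 0.90, "chat": 0.90},
    "claude-3-5-sonnet": {"code": 0.95, "chat": 0.80},
    "gpt-4o-mini": {"code": 0.60, "chat": 0.70},
}

def classify_query(prompt: str) -> str:
    """Crude stand-in for a learned query classifier."""
    code_markers = ("def ", "class ", "import ", "traceback", "```")
    return "code" if any(m in prompt.lower() for m in code_markers) else "chat"

def select_model(prompt: str) -> str:
    """Pick the candidate with the best score for the query's category."""
    category = classify_query(prompt)
    return max(MODEL_SCORES, key=lambda m: MODEL_SCORES[m][category])

print(select_model("import os\ndef main(): ..."))   # claude-3-5-sonnet
print(select_model("What's the weather like?"))     # gpt-4o
```

The cost-propagation concern above lives entirely in the dispatch half: whatever does the final call still needs to know which model actually ran so per-model pricing applies.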
Hi @xingyaoww @neubig, just caught this issue. While our LLMConfigs accept prices, they only help tune cost tradeoffs. You won't have to provide that parameter for public models - we track prices for every model we support. Beyond this, we're also happy to help you set up a routing integration with Not Diamond's API. Just let me know if that interests you. As for LiteLLM, we've actually been discussing an integration with them since July! While waiting on their feedback, we've also implemented a simple integration in our Python client which might help you.
Thanks @acompa, I do think we'd be interested in at least running an evaluation where we use NotDiamond as a backend and see if the results are better/cheaper than what we get now. If your API offers OpenAI-compatible endpoints it should be pretty easy (we haven't looked super-carefully yet).
We do accept OpenAI-style requests with
Cool, thanks! I'll re-open this as I think that whichever way we implement it, it'd be interesting to see if model routing helps.
Excellent. As you begin your evaluation, note that we offer two approaches to AI model routing:
- Our out-of-the-box router has been trained on generalist, cross-domain data (including coding and non-coding tasks) to provide a strong "multi-model", multidisciplinary experience.
- Since OpenHands focuses on development applications, you might benefit from specialized routing trained on the distribution of your proprietary data. We offer custom routing to serve these domain-targeted use cases as a higher-performance option beyond out-of-the-box routing.
We're happy to answer questions or support you in whichever of these approaches you evaluate.
@neubig we could also look into the https://github.com/Not-Diamond/RoRF/ repo (pair-wise routing) to start with?
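For reference, RoRF-style pairwise routing reduces an N-model choice to a chain of two-model decisions. The sketch below captures that structure under a hypothetical `score` heuristic; the real RoRF repo trains a random-forest classifier on prompt embeddings for each model pair, which this keyword/length heuristic merely stands in for.

```python
# Pairwise routing sketch: each decision compares exactly two models,
# and a fold over the candidate list yields the final choice.
from functools import reduce

def score(model: str, prompt: str) -> float:
    """Hypothetical scorer; RoRF would run a trained pairwise classifier."""
    base = {"strong-model": 0.6, "mid-model": 0.5, "cheap-model": 0.4}[model]
    if "code" in prompt and model == "strong-model":
        base += 0.3  # prefer the strong model for code-heavy prompts
    if len(prompt) < 20 and model == "cheap-model":
        base += 0.3  # prefer the cheap model for short, simple prompts
    return base

def pairwise_route(model_a: str, model_b: str, prompt: str) -> str:
    """One pairwise decision: keep whichever model scores higher."""
    return model_a if score(model_a, prompt) >= score(model_b, prompt) else model_b

def route(models: list[str], prompt: str) -> str:
    """Chain pairwise decisions across the candidate list."""
    return reduce(lambda a, b: pairwise_route(a, b, prompt), models)

models = ["cheap-model", "mid-model", "strong-model"]
print(route(models, "fix this code bug"))  # strong-model
print(route(models, "hi"))                 # cheap-model
```

One appeal of the pairwise formulation is incrementality: adding a new candidate model only requires training routers against the existing pool, not retraining a single N-way classifier.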
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days. |
I think the NotDiamond folks are working on this still. |
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days. |
This issue was closed because it has been stalled for over 30 days with no activity. |
I think this is still in progress? |
Yes, it is! |
What problem or use case are you trying to solve?
Not Diamond intelligently identifies which LLM is best-suited to respond to any given query. We want to implement a mechanism in OpenHands to support this type of "LLM selector".
Describe the UX of the solution you'd like
Ideally, users should be able to define an "LLMRouter" as a special type of LLM with some special configs (e.g., multiple keys for different providers). The user can just put in keys and select that router, and OpenHands will automatically use it going forward.
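One hypothetical shape for such a config, loosely following the `[llm]` section style of OpenHands' existing `config.toml` (all section and key names below are illustrative, not an implemented schema):

```toml
# Illustrative only: an "LLMRouter" configured like a special LLM.
[llm.router]
model = "notdiamond-router"

# Candidate models the router may dispatch to, each with its own key.
[llm.router.candidates.gpt-4o]
api_key = "sk-..."

[llm.router.candidates.claude-3-5-sonnet]
api_key = "sk-ant-..."
```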
Do you have thoughts on the technical implementation?
Modify https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/llm/llm.py, as well as config related files under https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/core/config.
You should probably use `model_select` (from the notdiamond API) rather than `create`, to be compatible with existing LiteLLM calls.
Describe alternatives you've considered
Additional context
Here's the documentation from NotDiamond
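Following the suggestion to use `model_select` rather than `create`, the flow would look roughly like the sketch below. The `client.chat.completions.model_select` call shape is an assumption based on notdiamond's public Python client (check their docs for the exact signature), and `completion_fn` is a hypothetical stand-in for the existing LiteLLM completion path in `openhands/llm/llm.py`.

```python
# Sketch: ask notdiamond's router which model to use, then dispatch
# through the existing LiteLLM completion path so cost tracking keeps
# working. The client interface here is an assumption, not verified.

def routed_completion(client, completion_fn, messages, candidates):
    """Select a model via the router, then call it via the normal path."""
    _session_id, provider = client.chat.completions.model_select(
        messages=messages,
        model=candidates,  # e.g. ["openai/gpt-4o", "anthropic/claude-3-5-sonnet"]
    )
    # `provider` identifies the chosen model; passing it to the normal
    # completion call keeps per-model cost accounting in one place.
    return completion_fn(model=str(provider), messages=messages)
```

Keeping the actual completion on the LiteLLM side (rather than letting the router proxy the request) is what addresses the cost-propagation concern raised earlier in this thread.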