
[Bug]: litellm doesn't support function-calling models from OpenRouter, so CodeActAgent can't interact with the internet on its own without asking the browser agent for help #4820

Closed
Tomlili43 opened this issue Nov 7, 2024 · 9 comments · Fixed by #4822
Labels: bug (Something isn't working), fix-me (Attempt to fix this issue with OpenHands)

Comments


Tomlili43 commented Nov 7, 2024

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Describe the bug and reproduction steps

Configuration:
[screenshot: OpenHands configuration settings]

litellm doesn't support function-calling models from OpenRouter. This bug means CodeActAgent can't interact with the internet on its own without asking the browser agent for help.

I'm using the model name anthropic/claude-3.5-sonnet.

OpenHands Installation

Docker command in README

OpenHands Version

main

Operating System

MacOS

Logs, Errors, Screenshots, and Additional Context

No response

@Tomlili43 Tomlili43 added the bug Something isn't working label Nov 7, 2024
xingyaoww (Collaborator) commented Nov 7, 2024

A quick, hacky way to solve this for now would be to add claude-3.5-sonnet directly to FUNCTION_CALLING_SUPPORTED_MODELS in llm.py. We could also clean up the other two sonnet entries with version suffixes there.
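As a rough illustration, the suggested change would look something like this; the list contents and the helper below are assumptions for the sketch, not the actual llm.py source:

```python
# llm.py (sketch) -- the exact list contents are assumptions; only the
# bare 'claude-3.5-sonnet' entry is the addition suggested above.
FUNCTION_CALLING_SUPPORTED_MODELS = [
    'claude-3-5-sonnet-20240620',
    'claude-3-5-sonnet-20241022',
    'gpt-4o',
    'claude-3.5-sonnet',  # suggested addition so the OpenRouter name matches
]


def supports_function_calling(model: str) -> bool:
    """True if the model name (with provider prefixes such as
    'openrouter/anthropic/' stripped) is in the supported list."""
    return model.split('/')[-1] in FUNCTION_CALLING_SUPPORTED_MODELS


print(supports_function_calling('openrouter/anthropic/claude-3.5-sonnet'))  # True
```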

@xingyaoww xingyaoww added the fix-me Attempt to fix this issue with OpenHands label Nov 7, 2024

github-actions bot commented Nov 7, 2024

OpenHands started fixing the issue! You can monitor the progress here.


github-actions bot commented Nov 7, 2024

A potential fix has been generated and a draft PR #4822 has been created. Please review the changes.

@FarVision2

I was getting this from OpenAI, OpenRouter, and Google. The agent was not able to generate anything at all with 0.12

03:03:37 - openhands:WARNING: codeact_agent.py:101 - Function calling not supported for model openai/gpt-4o-mini. Disabling function calling.

03:10:05 - openhands:WARNING: codeact_agent.py:101 - Function calling not supported for model gemini/gemini-1.5-flash-002. Disabling function calling.

04:12:39 - openhands:WARNING: codeact_agent.py:101 - Function calling not supported for model openrouter/deepseek/deepseek-coder. Disabling function calling.

enyst (Collaborator) commented Nov 13, 2024

@FarVision2 At the moment, we only support Anthropic and GPT-4o models for function calling. The rest have function calling disabled, but they should work just fine with an older variant of our code.

Could you tell us whether you got any errors, or anything else that might help us figure out what wasn't working? They should be working.

@enyst enyst self-assigned this Nov 13, 2024
@FarVision2

Who's 'we'? I have been using the project since it was OpenDevin, before it even had point releases, until the rename and investment. I'm glad the three are finally getting serious about it, but I got much more done on 0.09.

LiteLLM has always supported OpenRouter function calling.
https://docs.litellm.ai/docs/providers/openrouter
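(For reference, function calling through LiteLLM's OpenRouter provider looks roughly like this; the get_weather tool is a made-up example, and an OPENROUTER_API_KEY is assumed to be set in the environment:)

```python
# Minimal LiteLLM + OpenRouter function-calling sketch; the get_weather
# tool schema is a made-up example, not part of this issue.
import litellm

tools = [{
    'type': 'function',
    'function': {
        'name': 'get_weather',
        'description': 'Get the current weather for a city',
        'parameters': {
            'type': 'object',
            'properties': {'city': {'type': 'string'}},
            'required': ['city'],
        },
    },
}]

# Requires OPENROUTER_API_KEY in the environment.
response = litellm.completion(
    model='openrouter/anthropic/claude-3.5-sonnet',
    messages=[{'role': 'user', 'content': "What's the weather in Boston?"}],
    tools=tools,
)
print(response.choices[0].message.tool_calls)
```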

I have used many OpenRouter tool-calling models for testing in this project, even back when you had to chmod +x the work directory manually :)

LiteLLM supports Gemini
https://docs.litellm.ai/docs/providers/gemini

Anthropic has been low-tier for months.

I might use GPT-4o for the orchestrator, but using it for sub-agent processing makes no sense whatsoever.

https://openrouter.ai/qwen/qwen-2.5-coder-32b-instruct came out a few days ago and I am looking forward to putting it through its paces.

gemini-1.5-flash-002 is the current gold standard for all of my agentic processing via other projects.

https://artificialanalysis.ai/models
https://openrouter.ai/rankings/programming/scripting?view=week

Even captured-API projects like AutoGen MagenticOne can delegate to lower-cost APIs like 4o-mini.

Maybe it was a bug, and I will try 0.13 later today. If the project is surreptitiously reducing compatibility in order to drive people who don't know any better toward the Enterprise front end, that would be unfortunate, and I would have to move on.

To directly answer your question: it was a new directory with just the docker pull and no manual editing in the IDE.

xingyaoww (Collaborator) commented:

@FarVision2 In the PR we just merged, we evaluated a bunch of different models on SWE-Bench and found that not all models that "support function calling" work well with the current framework:

[screenshot: SWE-Bench evaluation results per model]

Based on these results, we currently enable native function calling by default only for OpenAI and Anthropic, and use a mocked version of function calling for the other providers.

For example, Gemini-series models have a bug where function calling suddenly stops working once the conversation gets longer: #4711 (comment)

We also tested Qwen 2.5 Coder 32B Instruct (through OpenRouter) on SWE-Bench Lite, but the result was pretty low (3.33%). I looked into some trajectories; the model seems to get into loops and then starts to output garbage at some point. I'm not sure if it's an LLM hosting bug or something else.
[screenshot: sample trajectory output]

enyst (Collaborator) commented Nov 14, 2024

@FarVision2 Sorry for the misunderstanding. Let me clarify:

  • We, the OpenHands project, are compatible with all LLMs supported by liteLLM. That does not change.
  • Until fairly recently, we were using all of these models without using their function-calling feature. They supported it, but we didn't make use of it. We had our own implementation of "actions" (the rough equivalent of functions) and always told the LLM how to use them.
  • We have introduced function calling for several models (the Anthropic and GPT-4o models mentioned above). We are still testing and fixing compatibility for the other models.

We are not decreasing compatibility; on the contrary, I'd say we're increasing it, because we previously didn't have function calling at all, and now we do.

Moreover, we are heading towards making it the default, so "compatibility" comes to mean keeping a support layer for the other models: those that don't have function calling, or for which litellm's layer doesn't fully work yet.

The warning messages you see only mean that this feature is not used, and OpenHands falls back to its own "actions" implementation instead. That should be fine, since it is how it has always worked!
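For context, a prompt-based fallback of this kind works roughly like the hypothetical sketch below: the tool is described in the system prompt, and the action is parsed out of the plain-text completion. The tag format and parsing are illustrative, not the actual OpenHands implementation:

```python
# Hypothetical sketch of a prompt-based "actions" fallback; the tag
# name and regex are illustrative only.
import re

SYSTEM_PROMPT = (
    'You can run shell commands by replying with '
    '<execute_bash>COMMAND</execute_bash>.'
)


def parse_action(completion_text: str) -> str | None:
    """Extract a bash command from a plain-text LLM completion."""
    match = re.search(
        r'<execute_bash>(.*?)</execute_bash>', completion_text, re.DOTALL
    )
    return match.group(1).strip() if match else None


# A model without native function calling emits the tag as ordinary text:
reply = 'Let me list the files.\n<execute_bash>ls -la</execute_bash>'
print(parse_action(reply))  # -> 'ls -la'
```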

If it doesn't, could you please explain in more detail what doesn't work, what it does, and what the logs say? Please note that the docker command in the README has a new parameter, -e LOG_ALL_EVENTS=true, to log events.

xingyaoww (Collaborator) commented Nov 14, 2024

And to add: most current LLM evaluations of programming ability don't really correlate with how well a model works in OpenHands.

Most of these programming evaluations (e.g., HumanEval) are single-turn: the model just needs to produce a block of code given an instruction. But OpenHands as an agent really requires the model to be very good at multi-turn interaction, which most LLM providers today don't measure, so we have to run SWE-Bench evaluations on these models ourselves. So far this has served the purpose well, and the scores have been well correlated with human judgment of agent quality.

xingyaoww added a commit that referenced this issue Nov 25, 2024
…l from OpenRouter. bug cause codeactagent couldn't interact with internet solely without ask browser agent for help (#4822)

Co-authored-by: Xingyao Wang <[email protected]>