Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify fn calling usage #6596

Merged
merged 6 commits into from
Feb 4, 2025
Merged

Simplify fn calling usage #6596

merged 6 commits into from
Feb 4, 2025

Conversation

enyst
Copy link
Collaborator

@enyst enyst commented Feb 3, 2025

Give a summary of what the PR does, explaining any non-trivial design decisions

This PR proposes to clean up mock_function_calling var from the agent. The llm knows if it supports it.
It's just a little clean-up, to help us read this code (part of refactoring towards message <-> event)


To run this PR locally, use the following command:

docker run -it --rm   -p 3000:3000   -v /var/run/docker.sock:/var/run/docker.sock   --add-host host.docker.internal:host-gateway   -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:cdb9248-nikolaik   --name openhands-app-cdb9248   docker.all-hands.dev/all-hands-ai/openhands:cdb9248

@enyst enyst marked this pull request as draft February 3, 2025 20:16
@enyst enyst marked this pull request as ready for review February 3, 2025 21:38
f'Function calling not enabled for model {self.llm.config.model}. '
'Mocking function calling via prompting.'
)
self.mock_function_calling = True
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really used in the agent, it's used in llm.py

@enyst enyst changed the title Refactor messages Simplify fn calling usage Feb 3, 2025
@enyst enyst requested a review from xingyaoww February 3, 2025 21:51
@enyst enyst requested a review from csmith49 February 4, 2025 19:26
Copy link
Contributor

github-actions bot commented Feb 4, 2025

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

Copy link
Contributor

github-actions bot commented Feb 4, 2025

Trigger by: Pull Request (integration-test label on PR #6596)
Commit: 03f49ec
Integration Tests Report (Haiku)
Haiku LLM Test Results:
Success rate: 100.00% (7/7)

Total cost: USD 0.08

instance_id success reason cost error_message
t07_interactive_commands True 0.005 nan
t04_git_staging True 0.006 nan
t01_fix_simple_typo True 0.015 nan
t03_jupyter_write_file True 0.007 nan
t02_add_bash_hello True 0.007 nan
t06_github_pr_browsing True 0.031 nan
t05_simple_browsing True 0.007 nan

Integration Tests Report (DeepSeek)
DeepSeek LLM Test Results:
Success rate: 85.71% (6/7)

Total cost: USD 0.02

instance_id success reason cost error_message
t05_simple_browsing False The answer is not found in any message. Total messages: 2. 0.003 nan
t04_git_staging True 0.002 nan
t03_jupyter_write_file True 0.002 nan
t01_fix_simple_typo True 0.003 nan
t02_add_bash_hello True 0.002 nan
t06_github_pr_browsing True 0.005 nan
t07_interactive_commands True 0.003 nan

Integration Tests Report Delegator (Haiku)
Success rate: 50.00% (1/2)

Total cost: USD 0.05

instance_id success reason cost error_message
t02_add_bash_hello True 0.022 nan
t01_fix_simple_typo False File not fixed: This is a text with a typo. 0.028 nan
Really?
No more errors!
Enjoy!

Integration Tests Report Delegator (DeepSeek)
Success rate: 100.00% (2/2)

Total cost: USD 0.00

instance_id success reason cost error_message
t01_fix_simple_typo True 0.001 nan
t02_add_bash_hello True 0.002 nan

Integration Tests Report VisualBrowsing (DeepSeek)
Success rate: 100.00% (1/1)

Total cost: USD 0.00

instance_id success reason cost error_message
t05_simple_browsing True 0.001 nan

Download testing outputs (includes both Haiku and DeepSeek results): Download

@enyst enyst merged commit 0d312a6 into main Feb 4, 2025
21 checks passed
@enyst enyst deleted the enyst/refactor-messages branch February 4, 2025 21:54
adityasoni9998 pushed a commit to adityasoni9998/OpenHands that referenced this pull request Feb 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants