Simplify fn calling usage #6596

enyst · 2025-02-03T20:16:41Z

Give a summary of what the PR does, explaining any non-trivial design decisions

This PR proposes to clean up mock_function_calling var from the agent. The llm knows if it supports it.
It's just a little clean-up, to help us read this code (part of refactoring towards message <-> event)

To run this PR locally, use the following command:

docker run -it --rm   -p 3000:3000   -v /var/run/docker.sock:/var/run/docker.sock   --add-host host.docker.internal:host-gateway   -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:cdb9248-nikolaik   --name openhands-app-cdb9248   docker.all-hands.dev/all-hands-ai/openhands:cdb9248

enyst · 2025-02-03T21:42:03Z

openhands/agenthub/codeact_agent/codeact_agent.py

-                f'Function calling not enabled for model {self.llm.config.model}. '
-                'Mocking function calling via prompting.'
-            )
-            self.mock_function_calling = True


This isn't really used in the agent, it's used in llm.py

github-actions · 2025-02-04T20:41:11Z

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

github-actions · 2025-02-04T21:33:11Z

Trigger by: Pull Request (integration-test label on PR #6596)
Commit: 03f49ec
Integration Tests Report (Haiku)
Haiku LLM Test Results:
Success rate: 100.00% (7/7)

Total cost: USD 0.08

instance_id	success	cost	error_message
t07_interactive_commands	True	0.005	nan
t04_git_staging	True	0.006	nan
t01_fix_simple_typo	True	0.015	nan
t03_jupyter_write_file	True	0.007	nan
t02_add_bash_hello	True	0.007	nan
t06_github_pr_browsing	True	0.031	nan
t05_simple_browsing	True	0.007	nan

Integration Tests Report (DeepSeek)
DeepSeek LLM Test Results:
Success rate: 85.71% (6/7)

Total cost: USD 0.02

instance_id	success	reason	cost	error_message
t05_simple_browsing	False	The answer is not found in any message. Total messages: 2.	0.003	nan
t04_git_staging	True		0.002	nan
t03_jupyter_write_file	True		0.002	nan
t01_fix_simple_typo	True		0.003	nan
t02_add_bash_hello	True		0.002	nan
t06_github_pr_browsing	True		0.005	nan
t07_interactive_commands	True		0.003	nan

Integration Tests Report Delegator (Haiku)
Success rate: 50.00% (1/2)

Total cost: USD 0.05

instance_id	success	reason	cost	error_message
t02_add_bash_hello	True		0.022	nan
t01_fix_simple_typo	False	File not fixed: This is a text with a typo.	0.028	nan
		Really?
		No more errors!
		Enjoy!

Integration Tests Report Delegator (DeepSeek)
Success rate: 100.00% (2/2)

Total cost: USD 0.00

instance_id	success	reason	cost	error_message
t01_fix_simple_typo	True		0.001	nan
t02_add_bash_hello	True		0.002	nan

Integration Tests Report VisualBrowsing (DeepSeek)
Success rate: 100.00% (1/1)

Total cost: USD 0.00

instance_id	success	reason	cost	error_message
t05_simple_browsing	True		0.001	nan

Download testing outputs (includes both Haiku and DeepSeek results): Download

enyst added 3 commits February 3, 2025 20:30

clean up check

f693255

small clean up agent, llm

2bdf4f2

fix response, add comments

f79f618

enyst marked this pull request as draft February 3, 2025 20:16

fix var

698d9ef

enyst marked this pull request as ready for review February 3, 2025 21:38

enyst commented Feb 3, 2025

View reviewed changes

enyst changed the title ~~Refactor messages~~ Simplify fn calling usage Feb 3, 2025

revert stop words

866c107

enyst requested a review from xingyaoww February 3, 2025 21:51

rename method

cdb9248

enyst requested a review from csmith49 February 4, 2025 19:26

csmith49 approved these changes Feb 4, 2025

View reviewed changes

enyst added the integration-test label Feb 4, 2025

xingyaoww approved these changes Feb 4, 2025

View reviewed changes

enyst merged commit 0d312a6 into main Feb 4, 2025
21 checks passed

enyst deleted the enyst/refactor-messages branch February 4, 2025 21:54

adityasoni9998 pushed a commit to adityasoni9998/OpenHands that referenced this pull request Feb 7, 2025

Simplify fn calling usage (All-Hands-AI#6596)

887a4bf

enyst mentioned this pull request Feb 7, 2025

Visual browsing in CodeAct using set-of-marks annotated webpage screenshots #6464

Merged

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify fn calling usage #6596

Simplify fn calling usage #6596

enyst commented Feb 3, 2025 •

edited

Loading

enyst Feb 3, 2025

github-actions bot commented Feb 4, 2025

github-actions bot commented Feb 4, 2025

Simplify fn calling usage #6596

Simplify fn calling usage #6596

Conversation

enyst commented Feb 3, 2025 • edited Loading

enyst Feb 3, 2025

Choose a reason for hiding this comment

github-actions bot commented Feb 4, 2025

github-actions bot commented Feb 4, 2025

enyst commented Feb 3, 2025 •

edited

Loading