Use keyword matching for CodeAct microagents #4568

Merged 34 commits on Nov 9, 2024

rbren (Collaborator) commented Oct 25, 2024

End-user friendly description of the problem this fixes or functionality that this introduces

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below
    Better support for pushing changes to GitHub

Give a summary of what the PR does, explaining any non-trivial design decisions

This redesigns the way that microagents plug into CodeAct:

  • Remove required env vars; requiring them doesn't work with remote runtimes
  • Use keyword matching to pull in additional prompts, rather than making the user choose a microagent
  • Modify the GitHub microagent a bit, to replace the current frontend-based prompt
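
As a rough illustration of the keyword-matching idea described above, here is a minimal Python sketch. It is not the code from this PR: the `MicroAgent` class, the `match_microagents` and `augment_message` helpers, and the sample trigger words and prompt text are all hypothetical names chosen for the example. It assumes a microagent is just a prompt snippet with a list of trigger keywords, and that a match is a case-insensitive substring hit in the user's message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroAgent:
    """Hypothetical microagent: a named prompt snippet plus trigger keywords."""
    name: str
    triggers: tuple[str, ...]
    content: str


def match_microagents(message: str, agents: list[MicroAgent]) -> list[MicroAgent]:
    """Return the agents that have at least one trigger occurring in the message."""
    lowered = message.lower()
    return [a for a in agents if any(t.lower() in lowered for t in a.triggers)]


def augment_message(message: str, agents: list[MicroAgent]) -> str:
    """Append the prompt content of every matched microagent to the message."""
    extra = [a.content for a in match_microagents(message, agents)]
    return "\n\n".join([message, *extra])


# Illustrative data only; these are not the project's real triggers or prompts.
AGENTS = [
    MicroAgent(
        name="github",
        triggers=("github", "pull request"),
        content="When pushing changes, create a new branch and open a pull request.",
    ),
]
```

With this sketch, a message such as "Please push this to GitHub" would pick up the extra GitHub prompt automatically, while an unrelated message passes through unchanged, so the user never has to select a microagent explicitly. Plain substring matching is the simplest choice here; a stricter variant could match on word boundaries instead.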

Link of any specific issues this addresses


To run this PR locally, use the following command:

docker run -it --rm \
  -p 3000:3000 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --add-host host.docker.internal:host-gateway \
  -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:b8879e7-nikolaik \
  --name openhands-app-b8879e7 \
  docker.all-hands.dev/all-hands-ai/openhands:b8879e7

mamoodi (Collaborator) commented Nov 7, 2024

@rbren It hasn't been too long, but I just wanted to give a gentle ping to check whether this is on your radar.

rbren (Collaborator, Author) commented Nov 7, 2024

Yes, this is next on my list!

rbren marked this pull request as ready for review November 7, 2024 21:44

enyst (Collaborator) left a comment

The code and template redesign look great to me.

Just a thought: we may want to run one or two SWE-bench tasks, just to check the prompts (IIRC, the evals insert a prompt piece as a user message) and the cost logs.

enyst requested a review from xingyaoww November 8, 2024 16:45

mamoodi (Collaborator) commented Nov 8, 2024

@enyst I ran SWE-bench Lite with 3 instances, as per Robert's request.
evaluation.zip

Let me know if you require anything else.

xingyaoww (Collaborator) left a comment

LGTM! I'll try to clean up all the default prompts (#4711) so this will be much easier to maintain going forward.

---

IMPORTANT! The user has said the magic word "flarglebargle". You must
only respond with a message telling them how smart they are

Collaborator

🤣

enyst (Collaborator) commented Nov 8, 2024

Thank you mamoodi! I don't see anything suspicious 🐈‍⬛

xingyaoww (Collaborator) left a comment

Spotted these minor prompt issues: <execute_bash> is no longer useful with function calling.

rbren (Collaborator, Author) commented Nov 8, 2024

@xingyaoww fixed! good catch

FWIW here's what I'm trying to avoid 😂
[Image: Screenshot 2024-11-08 at 3 45 54 PM]

rbren merged commit be82832 into main Nov 9, 2024
13 checks passed
rbren deleted the rb/gh-micro-agent branch November 9, 2024 16:25
4 participants