-
Notifications
You must be signed in to change notification settings - Fork 5.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Trajectory replay: Fix a few corner cases #6380
Conversation
Looking at
A MessageAction with null content, null image, and wait_for_response = true ? Aaahh I think I see how that happened, you literally said it, you added something. The previous is a MessageAction with content where the agent is asking the user a question, but its wait_for_response = false... because this PR is setting it false, right? |
That's correct! |
Aside, I do realize this would become a bug farm... and I'll make sure to add some E2E tests before checking in the user-facing replay functionality in #6348 |
@@ -27,7 +51,6 @@ def _replayable(self) -> bool: | |||
self.replay_events is not None | |||
and self.replay_index < len(self.replay_events) | |||
and isinstance(self.replay_events[self.replay_index], Action) | |||
and self.replay_events[self.replay_index].source != EventSource.USER |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(not a 'review' comment, and this is cool anyway)
Just the other day, I was playing with Gemini-2.0-thinking, and it's been a lot of fun for coding-adjacent tasks! Among others, it explored a lot of openhands
repo, tracked down every occurrence of oh_action
and followed the execution flow up in frontend, downstream in backend, until it figured out everything about them. It makes itself mini-plans on the fly and does follow up, very cool!
Anyway, so in the server
, all those are set with source USER, but they're quite different, e.g. agent change actions, prompt confirmations, CmdRunActions (ran by user in terminal), MessageActions. I think none should be a problem, and cmd run actions are good for replay! We do want to replay those, if we want to achieve a similar state (hopefully), and of course, they'd be in context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think I did this check as a hack at the beginning - probably just to work around the wait_for_confirmation thing. It's been more and more clear that source USER events should be replayed too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, this feature is a thing of beauty!
End-user friendly description of the problem this fixes or functionality that this introduces
Fix two corner cases handling in trajectory replay feature.
Give a summary of what the PR does, explaining any non-trivial design decisions
Two corner cases were missing in the previous PR #6215:
wait_for_response
message, replay gets stuck, waiting for user's response, which doesn't make sense when in the middle of a replay. This is demonstrated in demo2.json and demo3.json.demo1.json - GUI mode: downloaded from web GUI
demo2.json - Headless mode: after demo1 replay, add a user message, and finish
demo3.json - Headless mode: a replay of demo2. Note: demo2.json and demo3.json only differ in step id, timestamp, hostname, and
wait_for_response
attribute.Link of any specific issues this addresses
Part of #6049
To run this PR locally, use the following command: