
Split bash commands by the new line character #4462

Merged 30 commits into main on Oct 24, 2024

Conversation

@tofarr (Collaborator) commented Oct 17, 2024

End-user friendly description of the problem this fixes or functionality that this introduces

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

When a bash command spans multiple lines, the output is really confusing, so we split each command by the newline character for better readability. My initial approach was to split the command into multiple separate commands and add these to the event stream individually, but this did not play well with the LLM. The behavior was non-deterministic, but sometimes it would basically say "Thank you for running the first command - what about the others?" and supply the commands that had not yet been run to be run again, resulting in duplicates in the stream!

So my approach was instead to split the commands inside the runtime client to improve the output, while still treating them as a single CmdRunAction.
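
The sketch below shows the rough shape of that runtime-client approach. It is a simplified illustration, not the actual implementation: split_bash_commands is the helper quoted later in this review, while _run_single_command and the exact CmdOutputObservation fields are assumptions for the sake of the example; the real client also handles timeouts, confirmation mode, and working-directory tracking.

def run_command(self, action: CmdRunAction) -> CmdOutputObservation:
    # Split the multi-line command on newlines, but keep treating the whole
    # thing as one action with one observation.
    commands = split_bash_commands(action.command)
    all_output = ''
    exit_code = 0
    for command in commands:
        # Hypothetical helper: run exactly one command in the bash session.
        output, exit_code = self._run_single_command(command)
        # Echo the command before its output so the console reads like an
        # interactive session ("$ <command>" followed by its output).
        all_output += f'$ {command}\n{output}\n'
    # Still a single observation for the single CmdRunAction.
    return CmdOutputObservation(
        command=action.command,
        content=all_output,
        exit_code=exit_code,
    )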

Example 1:

Print the word "OpenHands" in the bash console in ascii art. Use only the echo command to do this. New lines of text should use the same invocation of the echo command

[screenshot]

Example 2:

Please run the following commands all at once:

* ls -lah
* echo "May the force be with you!"
* echo "May the odds be ever in your favor!"
* echo "Live long and prosper!"

[screenshot]

Example 3:

Please run the following commands all at once:

* ls -lah
* echo "hello world"
* echo a sample kubernetes deployment yaml into deploy.yaml

(This looks a little uglier because the console interprets each tab as 8 spaces.)
[screenshot]

Example 4 (With confirmation mode!):

Please do the following in a single bash command:

* sleep for 5 seconds
* print "You must Narfle the Garthok!"
* Print the current working directory

[screenshot]

Example 5 (Running user commands):

[screenshot]

@tofarr force-pushed the feat_split_commands_by_new_line branch from e4b2264 to 5b62bff on October 17, 2024 20:26
@tofarr marked this pull request as ready for review October 17, 2024 20:44
@xingyaoww (Collaborator) left a comment


We actually do something like this on the backend - how do you feel about moving it to the controller here? That is, we break one CmdRunAction into multiple ones and send them over for execution in the agent controller?

commands = split_bash_commands(action.command)
all_output = ''
for command in commands:
    if command == '':
        output, exit_code = self._continue_bash(

        new_action.thought = ''
    else:
        new_action.thought = action.thought
    self.event_stream.add_event(new_action, EventSource.AGENT)
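
Putting the review excerpts together, the controller-side splitting under discussion would look roughly like the following. This is a reconstruction from the diff fragments quoted in this thread, not the verbatim PR code:

commands = split_bash_commands(action.command)
for i, cmd in enumerate(commands):
    if cmd == '':
        continue
    new_action = CmdRunAction(command=cmd)
    # Only the last sub-action keeps the original thought, so the frontend
    # does not print the same thought once per command.
    if i < len(commands) - 1:
        new_action.thought = ''
    else:
        new_action.thought = action.thought
    self.event_stream.add_event(new_action, EventSource.AGENT)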
Collaborator:

This is an interesting idea - how will it work with longer-running commands, though? Here _pending_action is set. When the first new action is executed, the runtime will put an obs in the stream, which will make it unset here... this places them in the stream and continues looping here. Then the step would pass here before all are executed, no? What prevents it?
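
For context, the worry is the controller's single-pending-action bookkeeping. Very roughly, and purely as an illustrative sketch rather than the actual controller code:

# Simplified sketch: the controller assumes exactly one in-flight action.
async def _step(self):
    action = self.agent.step(self.state)
    self._pending_action = action  # set before the runtime executes it
    self.event_stream.add_event(action, EventSource.AGENT)

def _on_observation(self, obs):
    # The first matching observation clears the pending action, so if one
    # action were fanned out into several, the loop could advance before
    # sub-commands 2..N have produced their observations.
    self._pending_action = None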

Collaborator:

Yeah this probably messes with _pending_action quite a bit....

@tofarr you can probably test this by asking the agent something like:

Please run echo hello and ls -lah in the same <execute_bash> block

@enyst (Collaborator) requested changes Oct 18, 2024

I kept thinking about this PR since the other day, for some odd reason. I'd appreciate it if we take some time to make sure this could work. It seems to me it won't work right, or I just don't see why it would.

For added fun, let's take a delegate's case: if the action is split into 5 in a delegate controller, and mini-action 1 of 5 has an error, then... okay, I think something else is bugged currently, but even with a fix, say like this, the delegate ends here. In this PR, is it possible to have obs 2-5 arrive in the stream after the delegate ends? If not, can you please point out what would prevent it?

Because it seems like the child obs could end up in the parent's history, which is a tad messy. 😅

        continue
    new_action = CmdRunAction(command=cmd)
    if i < len(commands) - 1:
        new_action.thought = ''
Collaborator:

what's this about?

Collaborator Author:

This was from the original prompt:
"...Each instance should have a blank 'thought' attribute, except for the last one which should have the thought from the original action..."

The reason for this is that it prevents the frontend UI from printing the thought multiple times, e.g. (without this functionality):
[screenshot]

@enyst (Collaborator) commented Oct 18, 2024

It is possible that the action includes the same bash command several times, right? Like echo *, I don't know.

I wonder what would happen here. If they're separate in the stream, like mini-action 1, mini-obs 1, mini-action 2... it seems like they can be detected as repeated actions/obs, and the agent should stop with a stuck-in-a-loop error. 🤔 We do this for some cases when the LLM gets trapped and responds with the same thing over and over.

@rbren (Collaborator) commented Oct 18, 2024

It is possible that the action includes the same bash command several times, right? Like echo *, I don't know.

Yeah this is possible, but I can't imagine a scenario where the agent would want to do this :)

@rbren (Collaborator) commented Oct 21, 2024

So this seems to work on the backend, but the frontend isn't splitting things up correctly

[screenshot]

here's my sample prompt:

please run the following commands all at once:

* ls -lah
* echo "hello world"
* echo a sample kubernetes deployment yaml into deploy.yaml

It seems like we get all the CmdRunActions first, then we get all the Observations. Instead, we should get them collated properly.

@rbren (Collaborator) commented Oct 21, 2024

Also notice the mangled YAML content on the last command there - not sure what the issue could be...

@tofarr requested review from rbren, enyst and xingyaoww October 21, 2024 15:43
@tofarr (Collaborator, Author) commented Oct 21, 2024

The command posted by @rbren seems to be working now - though the bash console seems to turn tabs into 8 spaces, which makes it hard to read:
[screenshot]

@tofarr (Collaborator, Author) commented Oct 21, 2024

Confirmation mode is now behaving appropriately too:
[screenshot]

@rbren (Collaborator) commented Oct 21, 2024

@tofarr looks like it's still not working right based on your screenshot. The console in the UI should show:

$ ls -lah
./foo
./bar
$ echo "hello world"
hello world

etc. I.e. we should see the output of one command before the start of the next

  message.args.is_confirmed !== "rejected"
) {
  store.dispatch(appendInput(message.args.command));
}
Collaborator Author:

Commands are only outputted on completion

@@ -13,6 +13,7 @@ class CmdOutputObservation(Observation):
     exit_code: int = 0
     hidden: bool = False
     observation: str = ObservationType.RUN
+    meta: str = ''
Collaborator:

is there a better name for this?

Collaborator:

Perhaps location, or loc, or current_dir?

Collaborator Author:

I renamed it to interpreter_details.
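
So after the rename, the field would look roughly like this. This is a sketch based on the diff hunk above with only the field name changed; the Observation base class, ObservationType, and the dataclass decorator come from the OpenHands events module:

@dataclass
class CmdOutputObservation(Observation):
    exit_code: int = 0
    hidden: bool = False
    observation: str = ObservationType.RUN
    interpreter_details: str = ''  # was `meta` earlier in this PR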

@rbren (Collaborator) commented Oct 24, 2024

@enyst we went with a very different implementation here, if you wouldn't mind taking another look and unblocking.

@xingyaoww we would like your approval here as well.

@tofarr dismissed enyst's stale review October 24, 2024 13:43

Enyst's comments were valid and have been addressed in subsequent edits. We no longer split the commands in the way that was worrisome.

@tofarr merged commit 90e2bf4 into main Oct 24, 2024
@tofarr deleted the feat_split_commands_by_new_line branch October 24, 2024 13:44
@enyst (Collaborator) commented Oct 24, 2024

I agree my concerns were addressed.
