Prompt format for multi-step set up #11
Comments
Hi! Here is pseudocode for the multi-step prompt logic:

    # To predict the third action
    messages.append({
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": PROMPT_FOR_COMPUTER + f"{instruction}"
            },
            {
                "type": "image_url",
                "image_url": screenshot_from_init
            },
            {
                "type": "text",
                "text": previous_actions[0],
            },
            {
                "type": "image_url",
                "image_url": screenshot_from_state_0
            },
            {
                "type": "text",
                "text": previous_actions[1],
            },
            {
                "type": "image_url",
                "image_url": screenshot_from_state_1
            }
        ],
    })

Note that we apply the 'history 5' logic for multi-step online tasks, as discussed in the report.
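For readers who want to generalize the pseudocode above to an arbitrary step, here is a minimal sketch. The helper name `build_user_message`, its parameters, and the truncation rule are all hypothetical: it assumes that 'history 5' means keeping only the five most recent (action, screenshot) pairs while always keeping the instruction and the initial screenshot. The report may define the window differently, so treat this as one possible reading rather than the authors' implementation. The `image_url` values are passed through unchanged, as in the pseudocode; a real API may expect a different shape for that field.

```python
from typing import Any, Dict, List, Tuple


def build_user_message(
    instruction: str,
    initial_screenshot: str,
    steps: List[Tuple[str, str]],
    prompt_prefix: str = "",
    history: int = 5,
) -> Dict[str, Any]:
    """Build one user turn that interleaves past actions and screenshots.

    `steps` holds (action_text, screenshot_after_action) pairs, oldest first.
    ASSUMPTION: only the last `history` pairs are kept; the instruction and
    the initial screenshot are always included.
    """
    content: List[Dict[str, Any]] = [
        {"type": "text", "text": prompt_prefix + instruction},
        {"type": "image_url", "image_url": initial_screenshot},
    ]
    # steps[-0:] would return the whole list, so handle history <= 0 explicitly.
    recent = steps[-history:] if history > 0 else []
    for action, screenshot in recent:
        content.append({"type": "text", "text": action})
        content.append({"type": "image_url", "image_url": screenshot})
    return {"role": "user", "content": content}


# Usage: predict the third action after two completed steps (placeholder values).
messages = [
    build_user_message(
        "Open the settings page",
        "screenshot_from_init",
        [("action_0", "screenshot_from_state_0"),
         ("action_1", "screenshot_from_state_1")],
        prompt_prefix="PROMPT_FOR_COMPUTER ",
    )
]
```

With two completed steps this reproduces the six-item `content` list from the pseudocode; with more than five steps, the oldest pairs are dropped under the assumption stated above.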
Congrats on the great work and thanks for the comments.
Hi @llajan Did you try:
from:
That seems to do the job. Thank you!
Hi there,
Congratulations on the great work!
I'm curious how one should format the prompt for agent evaluation, i.e., when there are multiple turns of user-provided observations and agent actions.
So far I have tried the format below on a few OSWorld tasks, but the results don't look good. PROMPT_FOR_COMPUTER is just the prompt provided in the README. Basically, I used only the single most recent screenshot and condensed all previous actions into the same user turn.
Could you please share some insights here? Thank you!