
add_thinking #5220

Closed · wants to merge 3 commits

Conversation

@Tomlili43 (Author)

End-user friendly description of the problem this fixes or functionality that this introduces
Adds an agent thinking feature: the agent shows its step-by-step thinking in the UI.

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

Give a summary of what the PR does, explaining any non-trivial design decisions
Adds a thinking prompt and a thinking front-end UI.

[screenshots of the thinking UI]


Link of any specific issues this addresses

@neubig (Contributor) left a comment

Hi @Tomlili43, thanks for the contribution!

In general, all prompts should be implemented in the Python backend, not the frontend. And if we add a new prompt we'll need to validate its effect on agent accuracy and cost.

If you're interested in adding this prompt, could you try adding it in the agenthub/codeact_agent directory? (you can look at the directory structure there to understand what needs to be done)

We would then need to validate the effect of this change on performance and cost. I'm a bit worried that adding this big prompt in the input and encouraging longer outputs would lead to significant increases in time/token cost, but we could try a few examples to see if this makes a big difference in accuracy!
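
For context, here is a minimal sketch of the kind of backend change neubig is describing. The module name thinking_prompt.py, the constant THINKING_PROMPT, and the helper add_thinking_prompt are hypothetical illustrations, not actual OpenHands APIs; a real change would need to follow the existing structure of agenthub/codeact_agent.

# agenthub/codeact_agent/thinking_prompt.py -- hypothetical module name
# Sketch: keep the thinking instructions as a backend constant and append
# them to whatever system message the agent already builds.

THINKING_PROMPT = (
    "Before each action, reason step by step inside <thinking> tags: "
    "restate the goal, note what you have learned so far, and choose "
    "the next action. Keep the reasoning brief to limit token cost."
)


def add_thinking_prompt(system_message: str) -> str:
    """Append the thinking instructions to an existing system message."""
    return f"{system_message}\n\n{THINKING_PROMPT}"

With the prompt owned by the backend, the frontend would only render whatever <thinking> content the agent emits, rather than injecting the prompt itself.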

@Tomlili43 (Author)

@neubig yep:

  1. Added it in the agenthub/codeact_agent directory.
  2. For validation, it will cost a lot of tokens; to be honest, I can't afford that. Can you help with it?

@neubig added the run-eval-s (Runs evaluation with 5 instances) label on Nov 23, 2024
@neubig (Contributor) commented Nov 23, 2024

Hey @Tomlili43 , we can try to do this, but it'd be good if you could try running a few examples first to see if it's working!

@Tomlili43 (Author)

@neubig hi, to confirm: validation means running ./evaluation/agent_bench/scripts/run_infer.sh eval_thinking_prompt_llm HEAD CodeActAgent 1, right?

@neubig (Contributor) commented Nov 27, 2024

Yep, that looks right.
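
As an aside, my reading of the script's positional arguments, based on the usual OpenHands run_infer.sh convention (an assumption, not stated in this thread), is: LLM config name, git ref to evaluate, agent class, and number of benchmark instances. A hypothetical wrapper that spells this out:

# Hypothetical helper, not part of the repository: invokes the agent_bench
# evaluation script with named variables so the argument meanings are explicit.
import subprocess

llm_config = "eval_thinking_prompt_llm"  # LLM config section to use
git_ref = "HEAD"                         # version of the code to evaluate
agent = "CodeActAgent"                   # agent class under test
eval_limit = "1"                         # how many benchmark instances to run

subprocess.run(
    ["./evaluation/agent_bench/scripts/run_infer.sh",
     llm_config, git_ref, agent, eval_limit],
    check=True,
)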

@mamoodi (Collaborator) commented Dec 23, 2024

@Tomlili43 is this something you still wish to pursue?

@mamoodi (Collaborator) commented Dec 30, 2024

Going to close this. Please let me know if this is still in progress and we can reopen.

@mamoodi closed this on Dec 30, 2024
Labels: run-eval-s (Runs evaluation with 5 instances)