Clarifications on constructing user instructions #8

BillyZhang24kobe · 2024-10-14T03:56:41Z

Hi,

Thanks a lot for building such an amazing benchmark. I have a question about the process you used to create the user instructions for each task. In Section 4 of your paper (Benchmark Construction), under Stage III, you mention: "we write an initial user instruction, run a trial with gpt-4-turbo function calling agent, polish the user instruction by examining the trajectory, and do this iteratively until we are certain no ambiguities exist". Did you follow any particular attribute format when constructing these user instructions?

Take this instruction as an example: "You are aarav-garcia-1177. For your upcoming trip from ATL to PHL, you want to change for the cheapest economy flight and for the day after the original reservation. You are happy with original payment for refund." The user instructions seem to consistently follow this attribute format: user identity ("aarav-garcia-1177"), goal ("For your upcoming trip from ATL to PHL, you want to change for the cheapest economy flight and for the day after the original reservation"), and preferences ("You are happy with original payment for refund."). Sometimes the instruction also specifies a language style (e.g. "You are reactive to the agent and will not say anything that is not asked."). Could you please clarify whether you considered any attribute formats when creating the user instructions for the tasks in tau-bench? If so, would you mind sharing more details on the formats you considered during benchmark construction?
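To make the question concrete, here is a minimal Python sketch of the format as I understand it. This is only my reading of the examples, not something taken from the tau-bench code or paper: the `UserInstruction` class, its field names, and the `render` method are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserInstruction:
    """Hypothetical container for the attributes I observed (not from tau-bench)."""

    user_id: str                 # user identity, e.g. "aarav-garcia-1177"
    goal: str                    # what the user wants to accomplish
    preferences: str             # constraints such as refund/payment choices
    style: Optional[str] = None  # optional language style of the simulated user

    def render(self) -> str:
        """Join the attributes into a single instruction string."""
        parts = [f"You are {self.user_id}.", self.goal, self.preferences]
        if self.style:
            parts.append(self.style)
        return " ".join(parts)


example = UserInstruction(
    user_id="aarav-garcia-1177",
    goal=(
        "For your upcoming trip from ATL to PHL, you want to change for the "
        "cheapest economy flight and for the day after the original reservation."
    ),
    preferences="You are happy with original payment for refund.",
)
print(example.render())
```

Rendering `example` reproduces the instruction quoted above; setting `style` would append a sentence such as "You are reactive to the agent and will not say anything that is not asked."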

Thank you so much in advance!
