Issues on retail tasks #3

Open
dayyyyyyyyyy opened this issue Jul 10, 2024 · 1 comment

Hello, we have run and analyzed the retail tasks using GPT-4o. Below are some comments and questions about tau-bench.

  • Several problems arise because the user is also an LLM rather than an actual human (we understand that hiring human users is impractical). The most common case is that the simulated user does not hold to its instruction and is easily swayed by what the assistant says. For example, when the assistant recommends an item that the user was not originally looking for, the user still says yes. Even when the assistant shows incorrect information about the user, the user always confirms it, which rarely happens in real-world scenarios. The confirmation process does not seem to work as intended on the user side.

  • We also found that the wiki policy is missing some critical information about the retail system. Below are some additional system prompt rules that might be helpful. These rules are retail-specific, and we are wary of making the system prompt too ad hoc to tau-bench.

    • It is impossible to cancel only some of the items in a pending order. Cancellation applies only to the entire order. When asking the user to confirm a cancellation, always remind the user of this and make sure the user is willing to cancel the whole order.
    • The modification tool requires the number of current items and the number of new items to be equal. This means that increasing the number of items or cancelling specific items via the modification tool is not supported.
    • After a return, the order status changes to ‘return requested’ even if the user has returned only some of the items in the order. This means that no further returns or exchanges are allowed on that order.
    • After an exchange, the order status changes to ‘exchange requested’ even if the user has exchanged only some of the items in the order. This means that no further returns or exchanges are allowed on that order.
    • When the ‘address 2’ field of a user address is not applicable, always leave it as an empty string. Do not write “Not Applicable”, “NA”, or any other description.
  • We have a question about the orders.json file: are the orders stored in chronological order? Retail task #59 asks to cancel the older of two pending orders, but the order data does not seem to contain any date metadata.
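
To make the proposed policy additions concrete, here is a minimal sketch of how the rules above could be expressed as pre-checks before a tool call. This is only an illustration and not tau-bench's actual code: the `Order` class, its field names, the function names, and the extra "n/a" and "none" placeholders are all assumptions made for this example. Only the status strings and the rules themselves come from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    # Hypothetical stand-in for an order record; not the real orders.json schema.
    order_id: str
    status: str
    item_ids: list = field(default_factory=list)


# Statuses after which no further return or exchange is accepted (from the list above).
BLOCKING_STATUSES = {"return requested", "exchange requested"}

# Placeholder strings treated as "no address 2". "n/a" and "none" are assumed extras.
EMPTY_ADDRESS2 = {"", "na", "n/a", "none", "not applicable"}


def check_cancel(order, item_ids_to_cancel):
    """Reject any cancellation that does not cover the whole pending order."""
    if order.status != "pending":
        raise ValueError("only pending orders can be cancelled")
    if sorted(item_ids_to_cancel) != sorted(order.item_ids):
        raise ValueError("partial cancellation is not supported; cancel the whole order")


def check_modify_items(current_item_ids, new_item_ids):
    """Item modification must map each current item to exactly one new item."""
    if len(current_item_ids) != len(new_item_ids):
        raise ValueError("current and new item lists must have the same length")


def check_return_or_exchange(order):
    """A second return or exchange on the same order is not allowed."""
    if order.status in BLOCKING_STATUSES:
        raise ValueError(f"order is already '{order.status}'")


def normalize_address2(value):
    """Map 'not applicable' style placeholders to an empty string."""
    cleaned = value.strip()
    return "" if cleaned.lower() in EMPTY_ADDRESS2 else cleaned
```

Each check raises `ValueError` with a message that an agent could relay to the user, for example when asking the user to confirm cancelling the whole order instead of a single item.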
