[BFCL] Multi-turn query missing info & incorrect label #771

Open
mingzhu0527 opened this issue Nov 18, 2024 · 3 comments

@mingzhu0527

Describe the issue
Hi, we recently reviewed some multi-turn data from BFCL-v3 and noticed a few cases where the query does not contain enough information to complete the task, or where the ground truth is incorrect.

ID datapoint

| ID | Query | GroundTruth Answer | Comment |
| --- | --- | --- | --- |
| multi_turn_base_1 | I am alex. Check if the current directory is under my name and list all the visible and hidden contents in the current directory now, please. | `cd(folder='workspace')`, `ls(a=True)` | 'workspace' not mentioned in query |
| multi_turn_base_2 | Go into document folder and Could you draft up a create a document titled 'TeamNotes.txt' for keeping track of all the fresh ideas? | `cd(folder='documents')`, `touch(file_name='TeamNotes.txt')` | Query use 'document' but ground |
| multi_turn_base_3 | As part of my latest photography project, I need to gather files that have 'test' in their name from a specific folder in my current directory. Could you help me locate those? | `find(path='.',name='test')` | Query missing info |
| multi_turn_base_4 | Could you kindly place the report in the /tmp directory or show me the list of files in tmp if already there? | `pwd()`, `ls()` | GroundTruth error |
| multi_turn_base_11 | Display all the available files located within the '/temp' directory using the terminal interface. | `pwd()`, `ls(a=True)` | GroundTruth error |
| multi_turn_base_18 | Copy the txt contents of the 'Quarter1_Reports' directory and place it in a new directory naming it 'Archived_Quarter1. | `mkdir(dir_name='Archived_Quarter1')`, `cp(source='report1.txt',destination='Archived_Quarter1')`, `cp(source='report2.txt',destination='Archived_Quarter1')`, `cp(source='History101.txt',destination='Archived_Quarter1')`, `cp(source='History202.txt',destination='Archived_Quarter1')` | GroundTruth Error |

@Fanjia-Yan
Collaborator

Fanjia-Yan commented Nov 18, 2024

Hi,

Thank you for raising the issues above. Some of them may already have been addressed in #740, but let me go through them individually:

  1. multi_turn_base_1: This has been fixed in [BFCL Dataset Revamp 1/n] Multi-Turn (Part 1) #740; the new ground truth is ["ls(a=True)"]. The model needs to check the current path and, after confirming that the current directory is Alex, list everything in the current directory.
  2. multi_turn_base_2: Inside the current working directory there is a documents folder, while the question mentions only document. We expect the model to list the current directory, see the available options, and then make a choice.
  3. multi_turn_base_3: This is a data labelling issue on our side. We will modify the question to include an explicit directory name.
  4. multi_turn_base_4: The model should explore the file system and discover that the current directory is already the tmp folder.
  5. multi_turn_base_11: /temp should signify the root directory, which is where the user is initially configured to reside.
  6. multi_turn_base_18: I do see some ambiguity here. Our current working directory is Quarter1_Reports, which contains 4 txt files. The question asks to copy all text files to a new directory without explicitly mentioning where that directory resides, and the ground truth assumes it is the current working directory. Is this the ground truth error you are referring to?
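
To make the expectation in item 2 concrete, here is a minimal sketch of the "list first, then choose" step. It is an illustration only, not part of BFCL: the helper `resolve_folder` and the directory listing are hypothetical, and a model would perform this matching implicitly rather than through code.

```python
import difflib


def resolve_folder(requested, listing):
    """Return the entry in `listing` closest to `requested`, or None.

    Hypothetical helper: it mimics a model reading the output of ls()
    and picking the folder the user most plausibly meant.
    """
    if requested in listing:
        return requested
    matches = difflib.get_close_matches(requested, listing, n=1, cutoff=0.8)
    return matches[0] if matches else None


# Assumed output of ls() in the working directory (made up for illustration).
listing = ["documents", "photos", "notes.txt"]

# The query says 'document'; the only close match in the listing is 'documents'.
target = resolve_folder("document", listing)
```

With this listing, `target` resolves to `documents`, which is the folder the ground truth passes to `cd`.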

Thank you!

@mingzhu0527
Author

Thank you for your reply!
6. multi_turn_base_18: The ground truth error I'm referring to is that the query says "Copy the txt contents of the 'Quarter1_Reports' directory", but the ground truth does not first navigate to Quarter1_Reports or check what the current directory is.
You mentioned that "Our current working directory is Quarter1_Reports", but this information is not included in either the query or the system prompt, so the model could not figure it out. I wonder if it is supposed to be this way.

@Fanjia-Yan
Collaborator

Fanjia-Yan commented Nov 28, 2024

Yes, the location of the Quarter1_Reports directory is not mentioned anywhere in the provided context. However, the model can obtain this information via ls or pwd if it deems that necessary. We want the model to explore, but we don't judge the model's exploration or failed attempts, which is why we don't include them in the ground truth.

We judge the accuracy of a data entry by 2 criteria:

  1. Final state exact match
  2. Minimal function calling steps to achieve all user queries (as annotated in ground truth)
    Steps such as pwd are exploratory: they are not required to complete the task and do not affect the ground truth, so we do not include them in the ground truth.
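
The first criterion can be illustrated with a toy example for multi_turn_base_18. This is a rough sketch under stated assumptions, not the actual BFCL checker: the `ToyDir` class, the `run` and `final_state_matches` helpers, and the placeholder file contents are all hypothetical. Only the file names and the ground truth calls come from the issue above.

```python
import copy


class ToyDir:
    """Hypothetical stand-in for a working directory: name -> file text or sub-dict."""

    def __init__(self, name, entries):
        self.name = name
        self.entries = copy.deepcopy(entries)

    def pwd(self):
        # Exploratory: reads state, changes nothing.
        return self.name

    def ls(self):
        # Exploratory: reads state, changes nothing.
        return sorted(self.entries)

    def mkdir(self, dir_name):
        self.entries.setdefault(dir_name, {})

    def cp(self, source, destination):
        self.entries[destination][source] = self.entries[source]


def run(initial, calls):
    """Apply (function_name, kwargs) pairs to a fresh copy of the initial state."""
    fs = ToyDir("Quarter1_Reports", initial)
    for func, kwargs in calls:
        getattr(fs, func)(**kwargs)
    return fs.entries


def final_state_matches(initial, model_calls, ground_truth_calls):
    """Compare only the resulting state, not the call sequences themselves."""
    return run(initial, model_calls) == run(initial, ground_truth_calls)


files = ["report1.txt", "report2.txt", "History101.txt", "History202.txt"]
initial = {f: "placeholder" for f in files}  # file contents are made up

ground_truth = [("mkdir", {"dir_name": "Archived_Quarter1"})] + [
    ("cp", {"source": f, "destination": "Archived_Quarter1"}) for f in files
]

# A model that explores first still ends in the same final state.
with_exploration = [("pwd", {}), ("ls", {})] + ground_truth
```

In this sketch `final_state_matches(initial, with_exploration, ground_truth)` holds because `pwd` and `ls` leave the state untouched, whereas a trajectory that skips one of the `cp` calls does not match.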

Thanks for bringing this up, and sorry for the late reply. We will also document this procedure more clearly.

2 participants