[BFCL-v2] Dataset and Possible Answer Fix #661

Merged: 15 commits, Oct 16, 2024

Conversation

@HuanzhiMao HuanzhiMao commented Sep 27, 2024

Total number of entries affected: 682 (sum of the per-category counts below)

  • irrelevance: 1 entry
    • ['196']
  • live_multiple: 547 entries
    • ['0-0-0', '1-0-1', '4-2-1', '8-4-0', '13-4-5', '18-4-10', '23-5-0', '24-5-1', '25-6-0', '31-10-1', '34-11-0', '35-11-1', '39-14-1', '42-16-1', '44-17-0', '46-18-1', '50-20-0', '53-22-0', '55-22-2', '59-22-6', '61-23-0', '63-25-0', '64-26-0', '74-34-0', '81-36-2', '88-38-5', '103-43-1', '115-45-0', '144-56-0', '146-58-0', '150-58-4', '152-58-6', '153-58-7', '169-69-0', '170-70-0', '171-71-0', '174-72-0', '199-90-1', '206-91-0', '207-91-1', '209-91-3', '220-94-2', '224-98-0', '228-102-0', '247-111-0', '259-123-0', '263-126-0', '264-126-1', '265-127-0', '266-127-1', '267-127-2', '268-127-3', '269-127-4', '270-127-5', '271-127-6', '272-127-7', '273-127-8', '274-127-9', '275-127-10', '276-127-11', '277-128-0', '278-128-1', '279-128-2', '280-128-3', '281-128-4', '282-128-5', '283-128-6', '284-128-7', '285-129-0', '286-129-1', '287-129-2', '288-129-3', '289-129-4', '290-129-5', '291-130-0', '292-130-1', '293-130-2', '294-130-3', '295-130-4', '296-130-5', '297-130-6', '298-130-7', '299-130-8', '300-130-9', '301-131-0', '302-131-1', '303-131-2', '304-131-3', '305-131-4', '307-131-6', '308-131-7', '309-131-8', '310-132-0', '311-132-1', '312-132-2', '313-132-3', '314-132-4', '315-132-5', '316-132-6', '317-132-7', '318-132-8', '319-132-9', '320-132-10', '321-132-11', '322-132-12', '323-132-13', '324-132-14', '325-132-15', '326-132-16', '327-132-17', '328-132-18', '329-132-19', '330-132-20', '331-132-21', '332-132-22', '333-132-23', '334-132-24', '335-132-25', '337-133-1', '356-134-1', '357-134-2', '358-134-3', '359-134-4', '360-134-5', '361-134-6', '362-134-7', '363-134-8', '364-134-9', '365-134-10', '366-134-11', '367-134-12', '368-134-13', '369-134-14', '370-134-15', '371-134-16', '372-134-17', '373-134-18', '374-134-19', '375-134-20', '376-135-0', '377-135-1', '378-135-2', '382-137-0', '383-137-1', '384-137-2', '385-137-3', '386-137-4', '387-137-5', '388-137-6', '389-137-7', '390-137-8', '391-137-9', '392-138-0', '393-138-1', '394-138-2', '395-138-3', '396-139-0', 
'397-139-1', '398-139-2', '399-139-3', '400-139-4', '401-139-5', '404-140-0', '405-140-1', '406-140-2', '407-140-3', '408-140-4', '409-140-5', '410-140-6', '411-141-0', '412-141-1', '413-141-2', '414-141-3', '415-141-4', '416-141-5', '418-141-7', '419-141-8', '420-141-9', '422-141-11', '423-141-12', '424-141-13', '425-141-14', '426-141-15', '427-141-16', '428-141-17', '429-141-18', '430-141-19', '431-141-20', '432-141-21', '433-141-22', '434-142-0', '435-142-1', '436-142-2', '437-142-3', '438-142-4', '439-143-0', '440-144-0', '441-144-1', '442-144-2', '443-144-3', '444-144-4', '445-144-5', '446-144-6', '447-144-7', '448-144-8', '449-145-0', '451-145-2', '452-145-3', '453-145-4', '454-145-5', '455-145-6', '456-145-7', '460-145-11', '462-145-13', '463-145-14', '464-145-15', '465-145-16', '466-145-17', '467-145-18', '468-145-19', '471-145-22', '472-145-23', '473-145-24', '478-146-3', '482-146-7', '485-147-0', '486-147-1', '487-147-2', '488-147-3', '489-147-4', '490-148-0', '491-148-1', '492-148-2', '493-148-3', '494-148-4', '495-148-5', '496-148-6', '497-148-7', '498-148-8', '499-148-9', '500-148-10', '501-148-11', '502-148-12', '503-149-0', '504-149-1', '505-149-2', '506-149-3', '507-149-4', '508-149-5', '509-149-6', '510-149-7', '511-149-8', '512-150-0', '513-150-1', '514-150-2', '515-150-3', '516-150-4', '517-150-5', '518-150-6', '519-150-7', '520-150-8', '521-150-9', '522-150-10', '523-150-11', '524-151-0', '526-151-2', '527-151-3', '528-151-4', '529-151-5', '530-151-6', '531-151-7', '532-151-8', '533-151-9', '534-151-10', '535-151-11', '536-151-12', '537-151-13', '538-152-0', '539-152-1', '540-152-2', '541-152-3', '542-152-4', '543-152-5', '544-152-6', '545-152-7', '546-152-8', '547-152-9', '548-152-10', '549-152-11', '551-153-0', '554-154-0', '555-154-1', '556-154-2', '557-154-3', '559-154-5', '560-155-0', '561-155-1', '562-155-2', '563-155-3', '564-155-4', '565-155-5', '566-155-6', '567-155-7', '568-155-8', '569-155-9', '570-155-10', '571-155-11', '572-155-12', 
'573-155-13', '594-158-0', '595-158-1', '596-158-2', '597-158-3', '598-158-4', '599-158-5', '600-158-6', '601-158-7', '602-158-8', '603-158-9', '604-158-10', '605-158-11', '606-158-12', '607-159-0', '608-159-1', '609-159-2', '610-159-3', '611-159-4', '612-159-5', '613-159-6', '614-159-7', '615-159-8', '616-159-9', '617-159-10', '618-159-11', '619-159-12', '620-160-0', '632-161-0', '633-161-1', '634-161-2', '635-161-3', '636-161-4', '637-161-5', '638-161-6', '639-161-7', '640-161-8', '641-161-9', '642-161-10', '643-161-11', '644-161-12', '645-161-13', '646-161-14', '647-161-15', '648-161-16', '649-161-17', '650-161-18', '651-161-19', '652-161-20', '653-161-21', '654-161-22', '655-161-23', '656-161-24', '657-161-25', '658-162-0', '659-162-1', '660-162-2', '661-162-3', '662-162-4', '663-162-5', '664-162-6', '665-162-7', '666-162-8', '667-162-9', '668-162-10', '669-162-11', '670-162-12', '671-162-13', '672-162-14', '673-162-15', '674-162-16', '675-163-0', '676-163-1', '677-163-2', '678-163-3', '679-163-4', '680-163-5', '681-163-6', '682-163-7', '683-163-8', '684-164-0', '685-164-1', '686-164-2', '687-164-3', '688-164-4', '689-164-5', '690-164-6', '691-164-7', '692-164-8', '693-164-9', '694-164-10', '695-164-11', '696-164-12', '697-164-13', '698-164-14', '699-164-15', '700-164-16', '701-164-17', '702-164-18', '703-164-19', '704-164-20', '705-164-21', '706-164-22', '707-164-23', '708-164-24', '709-164-25', '710-164-26', '711-164-27', '712-164-28', '713-165-0', '714-165-1', '715-165-2', '716-165-3', '717-165-4', '718-165-5', '719-165-6', '720-165-7', '721-165-8', '722-165-9', '723-165-10', '724-165-11', '729-167-0', '730-167-1', '731-167-2', '732-167-3', '733-167-4', '734-167-5', '735-167-6', '736-167-7', '737-167-8', '738-168-0', '739-168-1', '740-168-2', '741-168-3', '742-168-4', '743-168-5', '744-168-6', '745-169-0', '746-169-1', '747-169-2', '748-169-3', '749-169-4', '750-169-5', '751-169-6', '752-169-7', '753-169-8', '754-169-9', '755-169-10', '756-169-11', 
'757-169-12', '758-169-13', '759-169-14', '760-169-15', '761-169-16', '778-173-0', '783-173-5', '794-175-0', '795-175-1', '796-175-2', '797-175-3', '798-175-4', '799-175-5', '800-175-6', '801-175-7', '802-175-8', '803-175-9', '804-175-10', '805-175-11', '806-175-12', '807-175-13', '808-175-14', '817-177-0', '818-177-1', '820-177-3', '821-177-4', '824-177-7', '825-178-0', '826-178-1', '827-178-2', '828-178-3', '829-178-4', '830-178-5', '831-178-6', '832-178-7', '833-178-8', '834-178-9', '835-178-10', '836-178-11', '837-178-12', '838-178-13', '839-178-14', '840-178-15', '841-178-16', '842-178-17', '843-178-18', '844-178-19', '845-178-20', '848-179-2', '849-179-3', '852-180-0', '853-180-1', '854-180-2', '855-180-3', '856-180-4', '857-180-5', '858-180-6', '866-182-3', '873-182-10', '875-183-0', '876-183-1', '877-183-2', '878-183-3', '879-183-4', '880-183-5', '881-183-6', '882-183-7', '883-184-0', '884-184-1', '885-184-2', '886-184-3', '887-184-4', '888-184-5', '889-184-6', '895-185-5', '899-185-9', '903-186-0', '904-186-1', '905-186-2', '906-186-3', '907-186-4', '908-187-0', '952-201-0', '966-208-0', '1007-236-0', '1013-242-0', '1015-244-0', '1029-257-0', '1032-260-0', '1034-262-0']
  • live_parallel: 11 entries
    • ['1-0-1', '2-0-2', '3-0-3', '5-2-0', '6-3-0', '7-3-1', '10-6-0', '11-7-0', '13-9-0', '14-10-0', '15-11-0']
  • live_parallel_multiple: 17 entries
    • ['0-0-0', '1-1-0', '4-3-0', '5-4-0', '6-5-0', '8-7-0', '9-8-0', '10-9-0', '11-10-0', '14-12-0', '16-14-0', '18-16-0', '19-16-1', '20-17-0', '21-18-0', '22-19-0', '23-20-0']
  • live_simple: 104 entries
    • ['2-2-0', '3-2-1', '4-3-0', '5-3-1', '6-3-2', '7-3-3', '8-3-4', '9-3-5', '10-3-6', '11-3-7', '12-3-8', '13-3-9', '14-3-10', '15-3-11', '16-3-12', '17-3-13', '18-3-14', '19-3-15', '20-4-0', '21-4-1', '22-5-0', '24-5-2', '30-8-0', '38-15-0', '41-17-1', '49-21-1', '50-22-0', '51-23-0', '53-24-0', '56-26-0', '78-39-0', '79-40-0', '84-45-0', '96-57-0', '97-57-1', '102-61-0', '112-68-0', '114-70-0', '120-76-0', '131-84-1', '141-94-0', '142-94-1', '143-95-0', '144-95-1', '145-95-2', '146-95-3', '147-95-4', '148-95-5', '149-95-6', '150-95-7', '151-95-8', '152-95-9', '153-95-10', '154-95-11', '155-95-12', '156-95-13', '157-95-14', '158-95-15', '159-95-16', '160-95-17', '166-99-0', '167-99-1', '171-99-5', '172-99-6', '173-99-7', '183-108-0', '184-109-0', '190-115-0', '191-115-1', '208-117-0', '209-117-1', '210-117-2', '211-117-3', '212-117-4', '213-117-5', '214-117-6', '215-117-7', '216-117-8', '217-117-9', '218-117-10', '219-117-11', '220-117-12', '221-117-13', '222-117-14', '223-117-15', '224-117-16', '225-117-17', '226-118-0', '227-118-1', '228-119-0', '233-123-0', '234-123-1', '235-124-0', '236-124-1', '237-125-0', '238-125-1', '239-125-2', '240-125-3', '241-125-4', '248-130-0', '250-132-0', '252-134-0', '253-135-0', '256-137-0']
  • parallel_multiple: 2 entries
    • ['83', '94']

This will affect the leaderboard score. We will update it in a separate PR.

We want to thank @budhiraja for pointing out these dataset issues.
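
As a rough illustration of how the per-category lists above relate to full test-case IDs such as `live_simple_50-22-0`, the sketch below joins a category name and an ID suffix with an underscore and tallies entries per category. The `category_suffix` ID format is an assumption inferred from the IDs quoted in this thread, the `affected` dictionary holds only a handful of IDs copied from the lists above, and the helper names `full_ids` and `tally` are hypothetical, not part of the BFCL codebase.

```python
from typing import Dict, List

# A small excerpt of the affected IDs listed above (not the complete lists).
affected: Dict[str, List[str]] = {
    "irrelevance": ["196"],
    "live_parallel": ["1-0-1", "2-0-2", "3-0-3"],
    "live_simple": ["50-22-0"],
    "parallel_multiple": ["83", "94"],
}


def full_ids(affected: Dict[str, List[str]]) -> List[str]:
    """Join each category with its ID suffix, e.g. 'live_simple' + '50-22-0'.

    The '<category>_<suffix>' format is assumed from IDs quoted in this thread.
    """
    return [
        f"{category}_{suffix}"
        for category, suffixes in affected.items()
        for suffix in suffixes
    ]


def tally(affected: Dict[str, List[str]]) -> Dict[str, int]:
    """Count how many entries are listed for each category."""
    return {category: len(suffixes) for category, suffixes in affected.items()}


if __name__ == "__main__":
    print(full_ids(affected))
    counts = tally(affected)
    print(counts, "total:", sum(counts.values()))
```

Running the same tally over the complete lists is one way to cross-check the per-category counts against the stated total.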

@HuanzhiMao HuanzhiMao added the BFCL-Dataset BFCL Dataset-Related Issue label Sep 27, 2024

budhiraja commented Sep 29, 2024

Thanks. Here are some more IDs where the ground truth seems ambiguous. Can you please take a look? Happy to accept the review for this one and follow up in the next PR if you prefer that.

live_simple_117-73-0, live_simple_33-10-0 [these two are a bit more worrisome in terms of prompts],
live_simple_50-22-0 [models predicting GB are penalized, but GB should be correct because that is what is mentioned in the prompt],
simple_183 [ground truth should allow Santa Clara because that is by default a county? So maybe the ground truth is not correct],
simple_375: location is not required?
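
The concern about `live_simple_50-22-0` comes down to which values the ground truth accepts for a parameter. A minimal sketch of that idea follows, assuming a possible-answer entry maps each parameter name to a list of acceptable values and that an empty string in that list marks the parameter as optional. The parameter names `country_code` and `unit`, their values, and the checker `matches_possible_answer` are all made up for illustration; they do not reproduce the actual dataset entry or the BFCL checker.

```python
from typing import Any, Dict, List

# Hypothetical possible-answer entry: each parameter lists every accepted value.
possible_answer: Dict[str, List[Any]] = {
    "country_code": ["UK", "GB"],  # accepting both avoids penalizing "GB"
    "unit": ["metric", ""],        # "" marks the parameter as optional (assumed)
}


def matches_possible_answer(
    call_args: Dict[str, Any], possible_answer: Dict[str, List[Any]]
) -> bool:
    """Return True if the model's arguments fit the accepted values."""
    # Reject arguments the ground truth does not know about.
    for name in call_args:
        if name not in possible_answer:
            return False
    for name, accepted in possible_answer.items():
        if name not in call_args:
            # A missing argument is fine only when the parameter is optional.
            if "" not in accepted:
                return False
            continue
        if call_args[name] not in accepted:
            return False
    return True


if __name__ == "__main__":
    print(matches_possible_answer({"country_code": "GB"}, possible_answer))
    print(matches_possible_answer({"country_code": "US"}, possible_answer))
```

Under this scheme, widening an accepted-value list is enough to stop penalizing a reasonable prediction, and a parameter such as `location` in `simple_375` would be optional only if its list allowed omission.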

HuanzhiMao and others added 5 commits October 6, 2024 00:03
…multiple 0-150, 550-600 (#3)

* Cleaned live parallel, parallel_multiple and simple

* Cleaned Line 0-150 and 550-600 of live_multiple

* Reinstated white spaces

* Edited Linux based instructions

* fix parallel_multiple_10

* Fixed cmd_controller.execute ambiguity

* Remove non-default null from live ground truth files.

* Reinstate unit parameter

---------

Co-authored-by: Huanzhi (Hans) Mao <[email protected]>
* cleaned 150-550 of live_multiple

* removed default parameter in prompts, added space formatting to json

* collapsed json file

* added whitespace

* finished cross check cleaning

* fix

---------

Co-authored-by: AndyChenYH <[email protected]>
Co-authored-by: Fanjia Yan <[email protected]>

@CharlieJCJ CharlieJCJ left a comment

LGTM

@budhiraja

Approving it since there are no major concerns. Please update the changelog (in case it's not correct) before merging this PR.

@ShishirPatil ShishirPatil changed the title [BFCL] Dataset and Possible Answer Fix [BFCL-v2] Dataset and Possible Answer Fix Oct 16, 2024
@HuanzhiMao

This PR mainly addresses issues with the Live dataset. The change log has been updated with the latest numbers.

@ShishirPatil ShishirPatil merged commit 57a3400 into ShishirPatil:main Oct 16, 2024
ShishirPatil pushed a commit that referenced this pull request Oct 21, 2024
This PR updates the leaderboard to reflect the change in score due to
the following PR merges:

1. #660 
2. #661
3. #683
4. #679
5. #708 
6. #709
7. #701
8. #657 
9. #658 
10. #640 
11. #653
12. #642 
13. #696 
14. #667

Close #662.

Note: Some models (like `firefunction`, `functionary`,
`microsoft/phi`) are not included in this leaderboard update because we
don't have all the entries generated. We will add them back once we have
the full results generated.
VishnuSuresh27 added a commit to VishnuSuresh27/gorilla that referenced this pull request Nov 11, 2024
---------

Co-authored-by: VishnuSuresh27 <[email protected]>
Co-authored-by: AndyChenYH <[email protected]>
Co-authored-by: Fanjia Yan <[email protected]>
Co-authored-by: Charlie Cheng-Jie Ji <[email protected]>
Co-authored-by: Shishir Patil <[email protected]>
Co-authored-by: Jason <[email protected]>