[BFCL-v2] Dataset and Possible Answer Fix #661
berkeley-function-call-leaderboard/data/possible_answer/BFCL_v3_live_multiple.json
berkeley-function-call-leaderboard/data/possible_answer/BFCL_v3_parallel_multiple.json
berkeley-function-call-leaderboard/data/possible_answer/BFCL_v3_live_simple.json
Thanks. Here are some more IDs where the ground truth seems ambiguous. Can you please take a look? Happy to accept the review for this one and follow up in the next PR if you prefer that: live_simple_117-73-0, live_simple_33-10-0 [these two are a bit more worrisome in terms of prompts].
…multiple 0-150, 550-600 (#3)
* Cleaned live parallel, parallel_multiple and simple
* Cleaned Line 0-150 and 550-600 of live_multiple
* Reinstated white spaces
* Edited Linux based instructions
* fix parallel_multiple_10
* Fixed cmd_controller.execute ambiguity
* Remove non-default null from live ground truth files.
* Reinstate unit parameter
---------
Co-authored-by: Huanzhi (Hans) Mao <[email protected]>
* cleaned 150-550 of live_multiple
* removed default parameter in prompts, added space formatting to json
* collapsed json file
* added whitespace
* finished cross check cleaning
* fix
---------
Co-authored-by: AndyChenYH <[email protected]>
Co-authored-by: Fanjia Yan <[email protected]>
LGTM
Approving since there are no major concerns. Please update the changelog (in case it's not correct) before merging this PR.
…live_multiple_1036-263-1) (#13)
* cleaned v3 live multiple entries 601-1037
---------
Co-authored-by: Huanzhi Mao <[email protected]>
This PR mainly addresses issues with the Live dataset. The change log has been updated with the latest numbers.
This PR updates the leaderboard to reflect the change in score due to the following PR merges:
1. #660
2. #661
3. #683
4. #679
5. #708
6. #709
7. #701
8. #657
9. #658
10. #640
11. #653
12. #642
13. #696
14. #667

Close #662.

Note: Some models (like `firefunction`, `functionary`, `microsoft/phi`) are not included in this leaderboard update because we don't have all the entries generated. We will add them back once we get the full result generated.
Total number of entries affected: 8 - irrelevance: 1 entries - `['196']` - live_multiple: 547 entries - `['0-0-0', '1-0-1', '4-2-1', '8-4-0', '13-4-5', '18-4-10', '23-5-0', '24-5-1', '25-6-0', '31-10-1', '34-11-0', '35-11-1', '39-14-1', '42-16-1', '44-17-0', '46-18-1', '50-20-0', '53-22-0', '55-22-2', '59-22-6', '61-23-0', '63-25-0', '64-26-0', '74-34-0', '81-36-2', '88-38-5', '103-43-1', '115-45-0', '144-56-0', '146-58-0', '150-58-4', '152-58-6', '153-58-7', '169-69-0', '170-70-0', '171-71-0', '174-72-0', '199-90-1', '206-91-0', '207-91-1', '209-91-3', '220-94-2', '224-98-0', '228-102-0', '247-111-0', '259-123-0', '263-126-0', '264-126-1', '265-127-0', '266-127-1', '267-127-2', '268-127-3', '269-127-4', '270-127-5', '271-127-6', '272-127-7', '273-127-8', '274-127-9', '275-127-10', '276-127-11', '277-128-0', '278-128-1', '279-128-2', '280-128-3', '281-128-4', '282-128-5', '283-128-6', '284-128-7', '285-129-0', '286-129-1', '287-129-2', '288-129-3', '289-129-4', '290-129-5', '291-130-0', '292-130-1', '293-130-2', '294-130-3', '295-130-4', '296-130-5', '297-130-6', '298-130-7', '299-130-8', '300-130-9', '301-131-0', '302-131-1', '303-131-2', '304-131-3', '305-131-4', '307-131-6', '308-131-7', '309-131-8', '310-132-0', '311-132-1', '312-132-2', '313-132-3', '314-132-4', '315-132-5', '316-132-6', '317-132-7', '318-132-8', '319-132-9', '320-132-10', '321-132-11', '322-132-12', '323-132-13', '324-132-14', '325-132-15', '326-132-16', '327-132-17', '328-132-18', '329-132-19', '330-132-20', '331-132-21', '332-132-22', '333-132-23', '334-132-24', '335-132-25', '337-133-1', '356-134-1', '357-134-2', '358-134-3', '359-134-4', '360-134-5', '361-134-6', '362-134-7', '363-134-8', '364-134-9', '365-134-10', '366-134-11', '367-134-12', '368-134-13', '369-134-14', '370-134-15', '371-134-16', '372-134-17', '373-134-18', '374-134-19', '375-134-20', '376-135-0', '377-135-1', '378-135-2', '382-137-0', '383-137-1', '384-137-2', '385-137-3', '386-137-4', '387-137-5', '388-137-6', 
'389-137-7', '390-137-8', '391-137-9', '392-138-0', '393-138-1', '394-138-2', '395-138-3', '396-139-0', '397-139-1', '398-139-2', '399-139-3', '400-139-4', '401-139-5', '404-140-0', '405-140-1', '406-140-2', '407-140-3', '408-140-4', '409-140-5', '410-140-6', '411-141-0', '412-141-1', '413-141-2', '414-141-3', '415-141-4', '416-141-5', '418-141-7', '419-141-8', '420-141-9', '422-141-11', '423-141-12', '424-141-13', '425-141-14', '426-141-15', '427-141-16', '428-141-17', '429-141-18', '430-141-19', '431-141-20', '432-141-21', '433-141-22', '434-142-0', '435-142-1', '436-142-2', '437-142-3', '438-142-4', '439-143-0', '440-144-0', '441-144-1', '442-144-2', '443-144-3', '444-144-4', '445-144-5', '446-144-6', '447-144-7', '448-144-8', '449-145-0', '451-145-2', '452-145-3', '453-145-4', '454-145-5', '455-145-6', '456-145-7', '460-145-11', '462-145-13', '463-145-14', '464-145-15', '465-145-16', '466-145-17', '467-145-18', '468-145-19', '471-145-22', '472-145-23', '473-145-24', '478-146-3', '482-146-7', '485-147-0', '486-147-1', '487-147-2', '488-147-3', '489-147-4', '490-148-0', '491-148-1', '492-148-2', '493-148-3', '494-148-4', '495-148-5', '496-148-6', '497-148-7', '498-148-8', '499-148-9', '500-148-10', '501-148-11', '502-148-12', '503-149-0', '504-149-1', '505-149-2', '506-149-3', '507-149-4', '508-149-5', '509-149-6', '510-149-7', '511-149-8', '512-150-0', '513-150-1', '514-150-2', '515-150-3', '516-150-4', '517-150-5', '518-150-6', '519-150-7', '520-150-8', '521-150-9', '522-150-10', '523-150-11', '524-151-0', '526-151-2', '527-151-3', '528-151-4', '529-151-5', '530-151-6', '531-151-7', '532-151-8', '533-151-9', '534-151-10', '535-151-11', '536-151-12', '537-151-13', '538-152-0', '539-152-1', '540-152-2', '541-152-3', '542-152-4', '543-152-5', '544-152-6', '545-152-7', '546-152-8', '547-152-9', '548-152-10', '549-152-11', '551-153-0', '554-154-0', '555-154-1', '556-154-2', '557-154-3', '559-154-5', '560-155-0', '561-155-1', '562-155-2', '563-155-3', '564-155-4', 
'565-155-5', '566-155-6', '567-155-7', '568-155-8', '569-155-9', '570-155-10', '571-155-11', '572-155-12', '573-155-13', '594-158-0', '595-158-1', '596-158-2', '597-158-3', '598-158-4', '599-158-5', '600-158-6', '601-158-7', '602-158-8', '603-158-9', '604-158-10', '605-158-11', '606-158-12', '607-159-0', '608-159-1', '609-159-2', '610-159-3', '611-159-4', '612-159-5', '613-159-6', '614-159-7', '615-159-8', '616-159-9', '617-159-10', '618-159-11', '619-159-12', '620-160-0', '632-161-0', '633-161-1', '634-161-2', '635-161-3', '636-161-4', '637-161-5', '638-161-6', '639-161-7', '640-161-8', '641-161-9', '642-161-10', '643-161-11', '644-161-12', '645-161-13', '646-161-14', '647-161-15', '648-161-16', '649-161-17', '650-161-18', '651-161-19', '652-161-20', '653-161-21', '654-161-22', '655-161-23', '656-161-24', '657-161-25', '658-162-0', '659-162-1', '660-162-2', '661-162-3', '662-162-4', '663-162-5', '664-162-6', '665-162-7', '666-162-8', '667-162-9', '668-162-10', '669-162-11', '670-162-12', '671-162-13', '672-162-14', '673-162-15', '674-162-16', '675-163-0', '676-163-1', '677-163-2', '678-163-3', '679-163-4', '680-163-5', '681-163-6', '682-163-7', '683-163-8', '684-164-0', '685-164-1', '686-164-2', '687-164-3', '688-164-4', '689-164-5', '690-164-6', '691-164-7', '692-164-8', '693-164-9', '694-164-10', '695-164-11', '696-164-12', '697-164-13', '698-164-14', '699-164-15', '700-164-16', '701-164-17', '702-164-18', '703-164-19', '704-164-20', '705-164-21', '706-164-22', '707-164-23', '708-164-24', '709-164-25', '710-164-26', '711-164-27', '712-164-28', '713-165-0', '714-165-1', '715-165-2', '716-165-3', '717-165-4', '718-165-5', '719-165-6', '720-165-7', '721-165-8', '722-165-9', '723-165-10', '724-165-11', '729-167-0', '730-167-1', '731-167-2', '732-167-3', '733-167-4', '734-167-5', '735-167-6', '736-167-7', '737-167-8', '738-168-0', '739-168-1', '740-168-2', '741-168-3', '742-168-4', '743-168-5', '744-168-6', '745-169-0', '746-169-1', '747-169-2', '748-169-3', 
'749-169-4', '750-169-5', '751-169-6', '752-169-7', '753-169-8', '754-169-9', '755-169-10', '756-169-11', '757-169-12', '758-169-13', '759-169-14', '760-169-15', '761-169-16', '778-173-0', '783-173-5', '794-175-0', '795-175-1', '796-175-2', '797-175-3', '798-175-4', '799-175-5', '800-175-6', '801-175-7', '802-175-8', '803-175-9', '804-175-10', '805-175-11', '806-175-12', '807-175-13', '808-175-14', '817-177-0', '818-177-1', '820-177-3', '821-177-4', '824-177-7', '825-178-0', '826-178-1', '827-178-2', '828-178-3', '829-178-4', '830-178-5', '831-178-6', '832-178-7', '833-178-8', '834-178-9', '835-178-10', '836-178-11', '837-178-12', '838-178-13', '839-178-14', '840-178-15', '841-178-16', '842-178-17', '843-178-18', '844-178-19', '845-178-20', '848-179-2', '849-179-3', '852-180-0', '853-180-1', '854-180-2', '855-180-3', '856-180-4', '857-180-5', '858-180-6', '866-182-3', '873-182-10', '875-183-0', '876-183-1', '877-183-2', '878-183-3', '879-183-4', '880-183-5', '881-183-6', '882-183-7', '883-184-0', '884-184-1', '885-184-2', '886-184-3', '887-184-4', '888-184-5', '889-184-6', '895-185-5', '899-185-9', '903-186-0', '904-186-1', '905-186-2', '906-186-3', '907-186-4', '908-187-0', '952-201-0', '966-208-0', '1007-236-0', '1013-242-0', '1015-244-0', '1029-257-0', '1032-260-0', '1034-262-0']` - live_parallel: 11 entries - `['1-0-1', '2-0-2', '3-0-3', '5-2-0', '6-3-0', '7-3-1', '10-6-0', '11-7-0', '13-9-0', '14-10-0', '15-11-0']` - live_parallel_multiple: 17 entries - `['0-0-0', '1-1-0', '4-3-0', '5-4-0', '6-5-0', '8-7-0', '9-8-0', '10-9-0', '11-10-0', '14-12-0', '16-14-0', '18-16-0', '19-16-1', '20-17-0', '21-18-0', '22-19-0', '23-20-0']` - live_simple: 104 entries - `['2-2-0', '3-2-1', '4-3-0', '5-3-1', '6-3-2', '7-3-3', '8-3-4', '9-3-5', '10-3-6', '11-3-7', '12-3-8', '13-3-9', '14-3-10', '15-3-11', '16-3-12', '17-3-13', '18-3-14', '19-3-15', '20-4-0', '21-4-1', '22-5-0', '24-5-2', '30-8-0', '38-15-0', '41-17-1', '49-21-1', '50-22-0', '51-23-0', '53-24-0', '56-26-0', 
'78-39-0', '79-40-0', '84-45-0', '96-57-0', '97-57-1', '102-61-0', '112-68-0', '114-70-0', '120-76-0', '131-84-1', '141-94-0', '142-94-1', '143-95-0', '144-95-1', '145-95-2', '146-95-3', '147-95-4', '148-95-5', '149-95-6', '150-95-7', '151-95-8', '152-95-9', '153-95-10', '154-95-11', '155-95-12', '156-95-13', '157-95-14', '158-95-15', '159-95-16', '160-95-17', '166-99-0', '167-99-1', '171-99-5', '172-99-6', '173-99-7', '183-108-0', '184-109-0', '190-115-0', '191-115-1', '208-117-0', '209-117-1', '210-117-2', '211-117-3', '212-117-4', '213-117-5', '214-117-6', '215-117-7', '216-117-8', '217-117-9', '218-117-10', '219-117-11', '220-117-12', '221-117-13', '222-117-14', '223-117-15', '224-117-16', '225-117-17', '226-118-0', '227-118-1', '228-119-0', '233-123-0', '234-123-1', '235-124-0', '236-124-1', '237-125-0', '238-125-1', '239-125-2', '240-125-3', '241-125-4', '248-130-0', '250-132-0', '252-134-0', '253-135-0', '256-137-0']` - parallel_multiple: 2 entries - `['83', '94']` This will affect the leaderboard score. We will update it in a separate PR. We want to thank @budhiraja for pointing out these dataset issues. --------- Co-authored-by: VishnuSuresh27 <[email protected]> Co-authored-by: AndyChenYH <[email protected]> Co-authored-by: Fanjia Yan <[email protected]> Co-authored-by: Charlie Cheng-Jie Ji <[email protected]> Co-authored-by: Shishir Patil <[email protected]> Co-authored-by: Jason <[email protected]>
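As a quick sanity check on the list above, the per-category counts can be tallied with a few lines of Python. This is only a sketch: the counts are copied from the commit message, while `AFFECTED_COUNTS`, `total_affected`, and `full_entry_id` are hypothetical names introduced here and are not part of the BFCL codebase. The `full_entry_id` helper assumes that a full test ID is the category name joined to the listed suffix with an underscore, as in the IDs quoted earlier in this thread (for example `live_simple_117-73-0`).

```python
# Per-category counts copied from the "entries affected" list in the commit message.
AFFECTED_COUNTS: dict[str, int] = {
    "irrelevance": 1,
    "live_multiple": 547,
    "live_parallel": 11,
    "live_parallel_multiple": 17,
    "live_simple": 104,
    "parallel_multiple": 2,
}


def total_affected(counts: dict[str, int]) -> int:
    """Sum the affected-entry counts across all categories."""
    return sum(counts.values())


def full_entry_id(category: str, suffix: str) -> str:
    """Hypothetical helper: join a category and a listed suffix into a full ID.

    Assumes the '<category>_<suffix>' pattern seen in IDs such as
    'live_simple_117-73-0' quoted in the review comments.
    """
    return f"{category}_{suffix}"


if __name__ == "__main__":
    print(total_affected(AFFECTED_COUNTS))
    print(full_entry_id("live_simple", "2-2-0"))
```

Summing the six stated counts gives 682 entries across the listed categories.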