Release Berkeley Function Calling Leaderboard Updates (v1.0) · ShishirPatil/gorilla

Highlights

🏆 We are thrilled to announce the stable v1.0 release of the Berkeley Function Calling Leaderboard dataset and evaluation pipeline! A heartfelt thank you to all our contributors and users for your enthusiastic engagement and support throughout v1. We are just getting started! Buckle up for v2 🚀 🚀 🚀

What's Changed

better handle float value comparison by @vandyxiaowei in #407
Bump pymysql from 1.1.0 to 1.1.1 in /goex by @dependabot in #453
Fixes For NexusHandler by @VenkatKS in #437
[BFCL] PR#407 Evaluation Pipeline Robustness Patch by @HuanzhiMao in #462
Add firefunction-v2 to the leaderboard by @pgarbacki in #470
[BFCL] Add Claude 3.5 Sonnet Function Calling Inference by @Fanjia-Yan in #480
[BFCL] Standardize Model Name Among handler_map and eval_runner_helper by @HuanzhiMao in #439
Remove redundant tokens from GPT-handler by @hellovai in #490
[GoEx] Undo Minor Bug Fix + README Minor Improvement by @royh02 in #468
[BFCL] Add ability to evaluate Nemotron-4-340B-Instruct by @Fanjia-Yan in #489
fix some data issues in parallel/parallel multiple answers by @vandyxiaowei in #423
[BFCL] Add Support for GLM-4-9B function calling inference by @Fanjia-Yan in #474
[BFCL] Sanity check is now optional by @ShishirPatil in #496
[BFCL] Improved tree-sitter java, javascript installation by @CharlieJCJ in #505
[BFCL] Fix Possible Answer for AST Parallel and Parallel_Multiple Category by @HuanzhiMao in #503
[BFCL] Add Test Dataset to Repository by @HuanzhiMao in #504
[BFCL] Support Category-Specific Generation for OSS Model, Remove eval_data_compilation Step by @HuanzhiMao in #512
[BFCL] Fix Double-Casting Issue in model_handler for Java and JS category. by @HuanzhiMao in #516
[BFCL] Fix Dataset Issue for executable_parallel_multiple Category by @HuanzhiMao in #522
[BFCL] add ibm-granite-20b-functioncalling model by @MayankAgarwal in #525
[BFCL] Overhaul apply_function_credential_config.py for Enhanced Usability by @HuanzhiMao in #508
Fixed the warning message "Setting pad_token_id to eos_token_id:1… by @dineshkumarsarangapani in #110
[BFCL] Specify package version in requirements.txt by @HuanzhiMao in #515
[BFCL] Standardize TEST_CATEGORY Among eval_runner.py and openfunctions_evaluation.py by @HuanzhiMao in #506
fix line return by @fantasist in #531
[BFCL] Apply Fix to Newly Introduced Model Handler Missed in Previous PR Merge by @HuanzhiMao in #536
[RAFT] Fix Datapoint Field in Formatter for Data Generation by @HuanzhiMao in #535
[BFCL] Fix language_specific_pre_processing for Java and JavaScript Test Category by @HuanzhiMao in #538
[BFCL] Patch Generation Script for Locally Hosted OSS model by @HuanzhiMao in #537
[BFCL] Support Multi-Model Multi-Category Generation; Add Index to Dataset; Handle vLLM Benign Error by @HuanzhiMao in #540
Add NousResearch/{Hermes-2-Pro-Llama-3-8B,Hermes-2-Theta-Llama-3-8B} models by @alonsosilvaallende in #542
[BFCL] Fix Dataset Pre-Processing for Java and JavaScript Test Category, Part 2 by @HuanzhiMao in #545
Add Salesforce xLAM handler and fix minor issues by @zuxin666 in #532
Add NousResearch/Hermes-2-{Pro-Llama-3-70B,Theta-Llama-3-70B} by @alonsosilvaallende in #556
Add Yi Handler by @fantasist in #543
Add more descriptive error message in eval_runner.py by @alonsosilvaallende in #552
[BFCL] Fix JS type converter to handle dictionaries with array values by @CharlieJCJ in #549
[BFCL] Handling rate limits by @ShishirPatil in #559
[BFCL] Fix Dataset and Possible Answer Issue by @HuanzhiMao in #557
[BFCL] Dataset Question Fix for Executable Parallel Category by @HuanzhiMao in #568
[BFCL] Add New Model gpt-4o-2024-08-06, gpt-4o-mini-2024-07-18 by @HuanzhiMao in #569
[BFCL] Add New Model open-mistral-nemo-2407, open-mixtral-8x22b, open-mixtral-8x7b by @HuanzhiMao in #570
[BFCL] Improve Warning Message when Aggregating Results by @HuanzhiMao in #517
[BFCL] Add New Model functionary-small-v3.1, functionary-small-v3.2, functionary-medium-v3.1; Update Token Price by @HuanzhiMao in #573
[BFCL] Set Model Temperature to 0.001 for All Models by @HuanzhiMao in #574
[BFCL] Support Parallel Inference for Hosted Models by @HuanzhiMao in #571
[BFCL Chore] Fix Functionary Medium 3.1 model name & add readme parallel inference by @CharlieJCJ in #577

New Contributors

@dependabot made their first contribution in #453
@VenkatKS made their first contribution in #437
@pgarbacki made their first contribution in #470
@hellovai made their first contribution in #490
@MayankAgarwal made their first contribution in #525
@dineshkumarsarangapani made their first contribution in #110
@fantasist made their first contribution in #531
@alonsosilvaallende made their first contribution in #542

Full Changelog: v0.3...v1.0
