Function call benchmark for llama3.1 8b & 70b (#136)
* Function call benchmark for llama3.1

* update results of llama3.1 - 70b
tybalex authored Jul 24, 2024
1 parent b9eda55 commit b0ade28
Showing 2 changed files with 22 additions and 0 deletions.
2 changes: 2 additions & 0 deletions docs/docs/benchmark.mdx
@@ -34,6 +34,8 @@ Some of the LLMs above require using custom libraries to post-process LLM genera

`groq/Llama-3-Groq-8B-Tool-Use` and `groq/Llama-3-Groq-70B-Tool-Use` are tested using [groq's API](https://console.groq.com/docs/tool-use).

`Meta/Llama-3.1-8B-Instruct` and `Meta/Llama-3.1-70B-Instruct` are tested using the user-defined custom tool calling format described in Meta's [official Llama 3.1 docs](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1/#user-defined-custom-tool-calling).

:::::
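
To make the post-processing step for the Llama 3.1 models concrete, here is a minimal sketch of pulling a tool call out of a model reply. It assumes the reply follows the `<function=NAME>{JSON arguments}</function>` shape shown in Meta's linked docs; the helper name `parseCustomToolCall` and the `get_weather` example are hypothetical and not part of this commit or the benchmark harness.

```javascript
// Hypothetical helper, not taken from the benchmark code.
// Assumes the reply shape <function=NAME>{JSON}</function>.
function parseCustomToolCall(reply) {
  const match = /<function=([^>]+)>([\s\S]*?)<\/function>/.exec(reply);
  if (match === null) {
    return null; // no tool call found, treat as a plain-text answer
  }
  try {
    return { name: match[1].trim(), arguments: JSON.parse(match[2]) };
  } catch (err) {
    return null; // arguments were not valid JSON
  }
}

// Example with a made-up tool name and arguments.
const reply = '<function=get_weather>{"city": "Paris"}</function>';
console.log(parseCustomToolCall(reply));
```

Returning `null` for both missing and malformed calls keeps the caller simple; a real harness might want to distinguish the two cases when scoring.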

`Nexusflow/NexusRaven-V2-13B` and `gorilla-llm/gorilla-openfunctions-v2` don't accept tool observations (the result of running a tool or function once the LLM calls it), so we appended the observation to the prompt.
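
As a rough illustration of appending an observation to the prompt, the sketch below concatenates a tool result onto the existing prompt text. The function name `appendObservation`, the `Observation from ...:` label, and the sample prompt are all assumptions made for this example; the docs do not say how the benchmark actually formats the appended text.

```javascript
// Hypothetical sketch; the label and layout are assumptions,
// not the format used by the benchmark.
function appendObservation(prompt, toolName, observation) {
  // Serialize non-string results so objects and numbers render predictably.
  const rendered =
    typeof observation === 'string' ? observation : JSON.stringify(observation);
  return `${prompt}\nObservation from ${toolName}: ${rendered}\n`;
}

const nextPrompt = appendObservation(
  'User: What is the weather in Paris?',
  'get_weather',
  { tempC: 21 }
);
console.log(nextPrompt);
```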
20 changes: 20 additions & 0 deletions docs/src/components/BenchmarkTable.js
@@ -249,6 +249,26 @@ const data = [
gsm8k: '-',
math: '-',
mtBench:'-',
},
{
model: 'Meta/Llama-3.1-8B-Instruct',
params: 8.03,
functionCalling: '32.50%',
mmlu: '-',
gpqa: '-',
gsm8k: '-',
math: '-',
mtBench:'-',
},
{
model: 'Meta/Llama-3.1-70B-Instruct',
params: 70.6,
functionCalling: '63.75%',
mmlu: '-',
gpqa: '-',
gsm8k: '-',
math: '-',
mtBench:'-',
}
];
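
Because the scores are stored as strings, a small validation pass can catch malformed entries (for instance a doubled percent sign) before the table renders or sorts them. The following is a minimal sketch under that assumption; `parsePercent` and `sortByFunctionCalling` are hypothetical helpers and are not part of `BenchmarkTable.js`.

```javascript
// Hypothetical helpers, not part of BenchmarkTable.js.
// Parses strings like '32.50%' into numbers; returns null for '-'
// placeholders or malformed values.
function parsePercent(value) {
  const match = /^(\d+(?:\.\d+)?)%$/.exec(String(value).trim());
  return match ? Number(match[1]) : null;
}

// Sorts rows by function-calling score, highest first.
// Rows without a parseable score are placed last.
function sortByFunctionCalling(rows) {
  const score = (row) => {
    const parsed = parsePercent(row.functionCalling);
    return parsed === null ? -Infinity : parsed;
  };
  return [...rows].sort((a, b) => score(b) - score(a));
}

const rows = [
  { model: 'Meta/Llama-3.1-8B-Instruct', functionCalling: '32.50%' },
  { model: 'Meta/Llama-3.1-70B-Instruct', functionCalling: '63.75%' },
];
console.log(sortByFunctionCalling(rows).map((row) => row.model));
```

Keeping the parser strict (anchored at both ends of the string) means a typo in the data surfaces as a missing score rather than a silently truncated number.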
