From 0e7c1b06c32ab90e0d3cf64825ed51eedd715509 Mon Sep 17 00:00:00 2001 From: Yan Gao Date: Sat, 7 Sep 2024 09:53:22 +0100 Subject: [PATCH] fix(benchmarks:skip) Update accuracy values for code challenge (#4157) --- benchmarks/flowertune-llm/evaluation/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/benchmarks/flowertune-llm/evaluation/README.md b/benchmarks/flowertune-llm/evaluation/README.md index 1b6383df296a..d7216c089d8a 100644 --- a/benchmarks/flowertune-llm/evaluation/README.md +++ b/benchmarks/flowertune-llm/evaluation/README.md @@ -37,7 +37,7 @@ The default template generated by `flwr new` (see the [Project Creation Instruct | | MBPP | HumanEval | MultiPL-E (JS) | MultiPL-E (C++) | Avg | |:----------:|:-----:|:---------:|:--------------:|:---------------:|:-----:| -| Pass@1 (%) | 32.60 | 26.83 | 29.81 | 24.22 | 28.37 | +| Pass@1 (%) | 31.60 | 23.78 | 28.57 | 25.47 | 27.36 | ## Make submission on FlowerTune LLM Leaderboard