How to add new models to the leaderboard? #25

chujiezheng · 2024-06-01T22:10:48Z

Thanks for your great work. Can I request evaluation of new models so that they can be added to the leaderboard?

CodingWithTim · 2024-06-02T22:39:31Z

Hi @chujiezheng, I am a fan of your work! We would love to add new models. Could you give us more information on the models you want to add? Currently we just maintain a very lightweight leaderboard in the README.

chujiezheng · 2024-06-02T23:22:58Z

@CodingWithTim Thanks for your kind words! I have some HF models that I want to add:

They are ranked by my educated guess of their performance. These models are obtained via our recently proposed ExPO (model extrapolation) method. You can find more ExPO-enhanced models in this 🤗 HuggingFace collection and see their performance on the AlpacaEval 2.0 leaderboard.

Due to API and GPU limits, I have so far only run the evaluation for Starling-LM-7B-beta-ExPO, which obtains a score of 24.9 and a 95% CI of (-2.2, 1.8). I attach the evaluation output files here. I would appreciate it if you could add Starling-LM-7B-beta-ExPO to the leaderboard. I would also greatly appreciate it if you could help evaluate the other models above and add them to the leaderboard.

BTW, since many research works have built their evaluations on Arena-Hard, do you have plans to build a leaderboard website like AlpacaEval's?

CodingWithTim assigned CodingWithTim and infwinston Jun 2, 2024
