About the evaluation of OpenChat-13b-V3.2 etc. #342

niexufei · 2023-09-01T08:39:27Z

On the AlpacaEval leaderboard, models like OpenChat-13b-V3.2, WizardLM-13B-V1.2, and Vicuna-13b-V1.5-16k have shown outstanding performance. Does OpenCompass have plans to support evaluation for these models?
Thanks a lot.

Answered by tonysy

tonysy · 2023-09-01T16:40:14Z

Maintainer

Thanks for the suggestions. We have added these models to our evaluation plan; results will be updated in the next one or two weeks.
