When evaluating a model on the mmlu dataset, why does mmlu_gen take several hours longer than mmlu_ppl? #401
amulil
Replies: 2 comments
-
May I ask whether the mmlu_ppl results you reproduced match the ones reported on the OpenCompass leaderboard?
-
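In generation mode (mmlu_gen), the model tends to produce between 10 and 100 words per question, and each of them has to be decoded one after another, which noticeably slows down inference. In perplexity mode (mmlu_ppl), the model only scores the candidate answers and does not generate any text.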
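To make the difference concrete, here is a rough back-of-the-envelope sketch, not OpenCompass code. It assumes four answer options per MMLU question, one scoring pass per option in ppl mode, and one decoding step per generated token in gen mode. The function names are hypothetical, and the counts are model calls per question, not measured times.

```python
def ppl_forward_passes(num_options: int = 4) -> int:
    """Assumed cost of ppl mode: one scoring pass per candidate answer."""
    return num_options


def gen_forward_passes(new_tokens: int) -> int:
    """Assumed cost of gen mode: one sequential decoding step per generated token.

    The pass over the prompt is taken to yield the first token, so at least
    one step is always needed.
    """
    return max(new_tokens, 1)


def gen_to_ppl_ratio(new_tokens: int, num_options: int = 4) -> float:
    """How many times more model calls gen mode needs than ppl mode."""
    return gen_forward_passes(new_tokens) / ppl_forward_passes(num_options)


if __name__ == "__main__":
    # The 10 to 100 range comes from the reply above.
    for n in (10, 100):
        print(f"{n} generated tokens: {gen_to_ppl_ratio(n):.1f}x the calls of ppl mode")
```

This is only a count of calls. The ppl passes can be batched together, while decoding steps must run in sequence, so the real gap in wall-clock time depends on the model, batch size, and the generation length limit.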
-
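As in the title.
Also, what exactly are the differences between these two evaluation variants, and when should each be used? Is mmlu_gen recommended for chat models and mmlu_ppl for non-chat models?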