Which MMLU validation setting is recommended? #714

mathfinder · 2024-08-27T03:58:24Z

❓ The question

I see that you provide many MMLU evaluation tasks.
Taking mmlu_stem as an example, the variants include mmlu_stem_test, mmlu_stem, mmlu_stem_var, mmlu_stem_mc_5shot, mmlu_humanities_mc_5shot, and mmlu_humanities_mc_5shot_test.
Which one is recommended?


aman-17 · 2024-10-22T18:47:59Z

For initial testing, start with the 5-shot variants (e.g., mmlu_stem_mc_5shot or mmlu_humanities_mc_5shot_test), which show how well the model generalizes from a few examples. For less capable models, however, don't rely on multiple-choice (MC) tasks right away, since such models may perform poorly on them. Establish the model's baseline performance on simpler tasks first, then move on to more complex evaluations such as MC.
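To compare the variants side by side, here is a minimal sketch that splits the task names from the question into their parts. The regular expression and the `describe_task` helper are hypothetical. They are not part of the repository, and what each suffix means (`_mc` for multiple choice, `_5shot` for five in-context examples, `_test` for the test split) is inferred from the names only.

```python
import re

# Task names copied from the question above.
TASKS = [
    "mmlu_stem_test",
    "mmlu_stem",
    "mmlu_stem_var",
    "mmlu_stem_mc_5shot",
    "mmlu_humanities_mc_5shot",
    "mmlu_humanities_mc_5shot_test",
]

# Hypothetical pattern: suffix meanings are assumptions based on the names.
PATTERN = re.compile(
    r"^mmlu_(?P<subject>[a-z]+)"
    r"(?P<var>_var)?"
    r"(?P<mc>_mc)?"
    r"(?:_(?P<shots>\d+)shot)?"
    r"(?P<test>_test)?$"
)


def describe_task(name: str) -> dict:
    """Split a task name into its parts; raise ValueError if it does not fit."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized task name: {name}")
    shots = match.group("shots")
    return {
        "subject": match.group("subject"),
        "var_suffix": match.group("var") is not None,
        "mc_suffix": match.group("mc") is not None,
        # None means the name does not state a shot count.
        "shots": int(shots) if shots is not None else None,
        "test_suffix": match.group("test") is not None,
    }


if __name__ == "__main__":
    for task in TASKS:
        print(task, describe_task(task))
```

Listing the parts this way makes it easier to see which variants differ only in format, shot count, or split before choosing one to run.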

mathfinder added the type/question label · Aug 27, 2024

