
Regarding the comparison to lm-evaluation-harness #10

Open
gakada opened this issue Jun 13, 2023 · 0 comments
gakada commented Jun 13, 2023

Regarding

> Compared to existing libraries such as evaluation-harness and HELM, this repo enables simple and convenient evaluation for multiple models. Notably, we support most models from HuggingFace Transformers

isn't

```
python main.py mmlu --model_name llama --model_path some-llama
```

roughly the same as

```
python main.py --model_args pretrained=some-llama,... --tasks hendrycksTest* --num_fewshot 5
```

in lm-evaluation-harness? For multiple models and tasks, there is also `python scripts/regression.py --models multiple-models --tasks multiple-tasks`. lm-evaluation-harness likewise supports most HuggingFace models, as well as some OpenAI and Anthropic models.
