Issue with MVBench Evaluation #227

Open
Backdrop9019 opened this issue Aug 24, 2024 · 0 comments

There seems to be an issue with the evaluation method in MVBench. Currently, correctness is checked by splitting the prediction, extracting only the first segment (word), and comparing it with the correct answer. However, with this approach any prediction whose first segment is just a closing parenthesis “)” is treated as correct.
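
To illustrate the failure mode, here is a simplified stand-in for this kind of check. It is an assumption about the comparison (a substring test on the first space-separated token), not a copy of the MVBench code, and `naive_check` and the example answers are hypothetical.

```python
def naive_check(pred: str, gt: str) -> bool:
    """Hypothetical stand-in: compare only the first space-separated token."""
    pred_option = pred.lower().split(' ')[0]
    gt_option = gt.lower().split(' ')[0]
    # Substring test: any fragment of the ground-truth option token passes.
    return pred_option.replace('.', '') in gt_option or gt_option in pred_option


gt = "(A) The person opens the door."  # made-up ground truth for illustration
print(naive_check("(A) The person opens the door.", gt))   # correct option
print(naive_check("(B) The person closes the door.", gt))  # wrong option
print(naive_check(")", gt))                                # no letter at all
```

Under this assumed comparison, the bare “)” passes because it is a substring of the ground-truth token “(a)”, even though it names no option.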

I believe it is essential to add a step that verifies whether the letter of the answer option is correct.
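
A minimal sketch of what such a step could look like, assuming answers are formatted like “(A) some text”. The names `extract_option_letter` and `check_ans_strict` are hypothetical and not part of MVBench; the regular expression is only one possible choice.

```python
import re
from typing import Optional


def extract_option_letter(text: str) -> Optional[str]:
    """Return the leading option letter ('A' from '(A) ...'), or None.

    Accepts forms such as '(A)', 'A)', 'A.', 'A:' or a bare 'A'.
    """
    match = re.match(r"\s*\(?\s*([A-Za-z])\s*(?:[).:]|$)", text)
    return match.group(1).upper() if match else None


def check_ans_strict(pred: str, gt: str) -> bool:
    """Count a prediction as correct only if its option letter matches."""
    pred_letter = extract_option_letter(pred)
    gt_letter = extract_option_letter(gt)
    # A prediction without a recognizable letter is never correct.
    return pred_letter is not None and pred_letter == gt_letter


gt = "(A) The person opens the door."  # made-up ground truth for illustration
print(check_ans_strict("(A) The person opens the door.", gt))
print(check_ans_strict("A) The person opens the door.", gt))
print(check_ans_strict(")", gt))
```

With this sketch, a prediction that contains only “)” or an empty string has no option letter and is rejected, while “(A)”, “A)” and “a.” are all accepted for a ground truth of “(A)”.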

While running your code, I mistakenly added an extra space to the answer prompt, making it “best option: ( ”, and noticed a significant increase in performance.

It would be great if the evaluation method could be made more robust!
