Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benchmark runner instantiates annotators #510

Merged
merged 7 commits into from
Oct 2, 2024
Merged

Conversation

bkorycki
Copy link
Contributor

PR #487 did not apply its test annotator instantiation change to modelbench's runner. This fixes that.

I also snuck in a bug fix for the "sxc" prompt issue that was causing some tests to fail.

@bkorycki bkorycki requested a review from a team as a code owner September 27, 2024 05:19
Copy link

github-actions bot commented Sep 27, 2024

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

src/modelbench/benchmark_runner.py Outdated Show resolved Hide resolved
src/modelbench/benchmark_runner.py Outdated Show resolved Hide resolved
src/modelbench/benchmark_runner.py Outdated Show resolved Hide resolved
@@ -36,26 +36,35 @@ def fake_all_secrets(value="some-value") -> RawSecrets:
return raw_secrets


class FakeExplodingAnnotator(FakeAnnotator):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pollutes a global variable with test-specific data, and as a general rule tests should not have side effects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you referring to the registration of the fake annotators? If so, I agree that it's problematic. However, tests' get_annotators just returns UIDs and then the runners get the actual annotators from the registry now.
Do you have suggestions to avoid test side effects on ANNOTATORS while still testing the runners' annotator-specific functionality?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I prefer to avoid global variables, which neatly solves this problem. But since fixing that is probably out of scope for this, I think the next-best solution is either setup/teardown (which add the fakes and then restore state after the test has run) or mocking such that the test diverts the production code so that it uses a fake registry which is discarded after the test has run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense! Updated.

@bkorycki bkorycki requested a review from wpietri October 1, 2024 22:04
Copy link
Contributor

@wpietri wpietri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks great now.

Copy link
Collaborator

@dhosterman dhosterman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks great!

@bkorycki bkorycki merged commit 0ce1e70 into main Oct 2, 2024
4 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Oct 2, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants