-
-
Notifications
You must be signed in to change notification settings - Fork 422
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Proposal] Make ignoring test warnings more flexible #1165
Comments
I'd have to read this through in more details to properly understand it but my first thought is that I know Gymnasium handles certain errors in their envs in testing just by making a list of errors to catch and doing it in pretty simple way (link), my preference would be to make our testing line up closer with how they do it, rather than developing a completely new system like this. That approach means there is nothing extra in the environment files and the warnings and such all is contained in the test runner files ( If a given user wants to hard code things into their env in order to ignore certain warnings they can, but I think by default the idea is that we want the libraries to err on the side of warning too much, rather than warning too little--they are just warnings which can be ignored if they aren't relevant to you. Having these sorts of things hard coded in the API test or seed test is not the best design (I believe there is some of this in PettingZoo's repo, don't see anything in Gymnasium which is good) There's a lot of quality of life things that Gymnasium does that I feel like would be more useful to have for new users, as compared to things like this which IMO is just an extra level of abstraction to learn and potentially get confused by. For example this file also has some warning catching as well, and has helpful links to the Gymnasium environment creation tutorial https://github.com/Farama-Foundation/Gymnasium/blob/81151aba1ee45df2327a01dfebcdede140aac0fe/tests/utils/test_env_checker.py#L251 I've talked with Jet about it as well but IMO the biggest issue for new users is that there are two environment creation tutorials (one in the tutorials section and one in the general documentation section: https://pettingzoo.farama.org/content/environment_creation/ and https://pettingzoo.farama.org/tutorials/custom_environment/, I tried to rename them and be a bit more clear, but still could use work). If these were cleaned up and turned into more full examples (or Gymnasium even has a full repo which is a 'skeleton repo' https://github.com/Farama-Foundation/gymnasium-env-template) and linked in the error codes and stuff like that, I think that would be super useful as well, just as another example. |
I'll respond to the rest later, but a quick clarification. Users of pettingzoo envs will see no changes.
All the rest is internal changes to the test suite. |
re: gymnasium handling. Looks like they ignore all warnings of a specific type. That goes against erring on the side of warning too much. I'd think a warning should be issued unless the env author has explicitly said it's by design. This approach adds a standard way for authors to do so. I disagree with the test code tracking which warnings to ignore for each env (the current approach) for the reasons given in the motivation above. As a concrete example, say I want to change knights_archers_zombies so zombies act differently: Original, no issues:
Renamed, but otherwise identical, code gives problems:
Should I get different results from identical code? That's not a good experience and will baffle a new env author. I agree that there are other QOL items that can be improved. And I agree it'd be nice to avoid putting test items in the env file. EDIT: I modified the 'alternatives' section in the original proposal to add an example of how this could be placed in the metadata. It requires no imports in env files, requires less code to use, and is possibly less intrusive to an env. |
Proposal
The test suite throws warnings for env behaviours that are generally frowned upon, but are expected and correct behaviour for some envs. I want to give envs the ability to include information on which warnings to suppress for that env. The test suite will know from the env what is an intentional part of the design so it won't complain about it.
This will make the api test more useful to env developers.
It will also document that a non-standard behaviour is intentional.
This would be a completely optional feature. All envs would work as is without change.
Motivation
Many included envs raise warnings during pytest despite correct behaviour. Unless these are suppressed, the test will always complain about those envs. Currently that suppression is centrally tracked via hard-coded lists in the test suite.
This is an issue for a couple reasons:
Pitch
There are 3 parts to this:
These will be used in place of the current generic user warnings. The warning given by the test code for this class:
to set filters to ignore warnings that the env is expected to throw.
env.set_pz_test_filters()
, if it exists, to set filters.This will add the filters from the env and reset them after this test.
The erroneous warnings will be ignored.
Alternatives
Leave the centrally located suppression lists for repo envs, and leave users to suppress warnings on their own in their custom envs.
pros: easier
cons: repo envs and custom envs behave differently. This puts an added barrier to a new env developer. (see also motivation section for other cons)
Put warning filters at the top of the env files so they are always active.
pros: prevents adding a function to the env code. avoids the need for item 3 in pitch section.
cons: always applies filters even during non-tests. This is a global operation which will trigger a bug where the wrong filters are used if more than one env is loaded in the same session (plausible/likely during development; i think it also happens during a couple tests).
Instead of the function (item 2 of pitch), put the names of the warnings to ignore in the metadata (see example below). All warnings in this list will be ignored by the test suite.
pros: cleaner in the sense that it prevents extra executable code. prevents additional imports in envs
cons: it is less flexible and might be limiting if there is a custom filter a developer wants to add (maybe this is ok?)
In this approach, an env author would only need to add the following to the metadata to suppress a warning - no imports and no extra functions needed:
pros: don't need to define new warnings
cons: if the warning text changes, these break. Also, a too general regex can incorrectly catch other warnings, especially when the messages are similar.
Additional context
I can make these changes, but there was some pushback to the idea on discord so i want to reach a consensus on the best approach to addressing the problems described in the motivation section.
Checklist
The text was updated successfully, but these errors were encountered: