-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug]: Using the start value of Discrete spaces has no effect #2052
Comments
Hey @araffin , this is my first interaction with public git repositories. Could you help me and tell me, what I am missing on this issue, when you comment "You have checked the required items in the checklist but you didn't do whats written ...". |
Hello, a quick search in the issues will give you #913 and all the others #1295 #1378 We currently also have an explicit warning in our env checker: stable-baselines3/stable_baselines3/common/env_checker.py Lines 27 to 41 in 9caa168
I appreciate your motivation and that's also why we have a contributing guide to get you started: https://github.com/DLR-RM/stable-baselines3/blob/master/CONTRIBUTING.md Currently, as written in the other issues and in the env checker, there is a simple alternative which is to use a wrapper (or to update the env to do the offset there). |
sorry for not being clear enough, after looking again at the PR, I think it would be better to document how such wrapper would look like rather than having a more complex codebase. I guess we should add something like: class ShiftWrapper(gym.Wrapper):
"""Allow to use Discrete() action spaces with start!=0"""
def __init__(self, env: gym.Env) -> None:
super().__init__(env)
assert isinstance(env.action_space, gym.spaces.Discrete)
self.action_space = gym.spaces.Discrete(env.action_space.n, start=0)
def step(self, action: int):
return self.env.step(action + self.env.action_space.start) to our documentation, the same way we document how to use isaac sim with sb3. |
I am confident that everyone will perceive the current state of the discrete action spaces in the project as a bug. If users are required to implement the ShiftWrapper themselves, it will likely be viewed as a workaround. If we truly require such a ShiftWrapper, it should have already been implemented, rather than being added individually by each user. From my perspective, adding an additional wrapper does not improve readability, as its logic differs from the Box-Scaling logic. The Box already handles scaling, and the primary changes in my pull request are the "elif isinstance(Discrete, MultiDiscrete)" block. The logic here largely mirrors the initial "if" block, using "scale" and "unscale." Therefore, anyone who understands the first "if" statement should also be able to understand the second. In fact, they might even better understand the first "if" statement upon encountering the same logic applied to a different space. The handling of discrete action spaces in the scale and unscale methods also follows the same logic as the Box-scaling approach: transforming between action bounds and a standard range. |
SB3 intentionally doesn't support all features from Gymnasium.
It is a workaround, but a simple one. And it is not the only solution. I'm also sorry that you put work in code that might not be used at the end. |
🐛 Bug
For example spaces.Discrete(3, start=-1) does not produce actions in {-1, 0, 1} but rather in {0, 1, 2} in both the step function and the predict function.
To Reproduce
Code Sample: Enviroment with a Discrete Action starting not from zero
Relevant log output / Error message
System Info
Checklist
The text was updated successfully, but these errors were encountered: