-
Notifications
You must be signed in to change notification settings - Fork 725
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Question] 'layers' in 'policy_kwargs' does not functional for cnn features in general, but it works in some implementation like DDPG, SAC #526
Comments
You are right: The SAC and DDPG codes run inputs through @araffin Was there any special reason for doing it this for DDPG/SAC, in case there is a hidden "gotcha" somewhere? |
We are not sure of the intension. So let the code as original. And just be careful when using CNN feature extraction!
I don't think so. I created SAC/TD3 policy mimicking DDPG behavior, but never really used the code as normally you don't use image with DDPG/SAC/TD3. So yes, I think we should document or fix that anyway.
For the flexible mlp, it is already documented that is works only for vector observations (and not images). |
Thanks for the quick answers! |
You can leave it open for now until the documentation is fixed. If you feel like, you can suggest a PR to fix the documentation to address this issue :) |
@araffin @Miffyli |
Should be fixed in SB3: https://github.com/DLR-RM/stable-baselines3 |
As you see in the code of DQN policy below
stable-baselines/stable_baselines/deepq/policies.py
Lines 103 to 117 in 22c0753
layers after extracted_features does not appended for CNN features.
I can understand this as forcing to create full network in the custom cnn_extractor.
But some other policy implementation like DDPG, SAC, you are appending layers in both CNN and MLP cases. You can see this of SAC policy in the code below .
stable-baselines/stable_baselines/sac/policies.py
Lines 205 to 216 in 22c0753
Is this inconsistency of using 'layers' argument is intended?
Should we be careful about this argument for case-by-case?
Please shed some light on me.
Thanks.
The text was updated successfully, but these errors were encountered: