Hi, I noticed that DRL_predict, which is used when we take the trained model to trade out of sample, calls the model.predict method without specifying deterministic=True. The default value of this argument in stable_baselines for model.predict is False. This means that out of sample, actions are sampled from the policy distribution rather than taken at its mode. Could you provide the rationale for sampling here, as opposed to just using the distribution mode? Thank you very much.
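To make the distinction concrete, here is a minimal sketch of the two behaviours for a Gaussian policy. It is not the library's code: `toy_predict`, `mean`, and `std` are hypothetical names made up for illustration, and the only point is that the deterministic branch returns the mode (the mean, for a Gaussian) while the default branch draws a fresh sample each call.

```python
import random


def toy_predict(mean, std, deterministic=False, rng=None):
    """Hypothetical stand-in for action selection in a Gaussian policy.

    deterministic=True  -> return the distribution mode (the mean).
    deterministic=False -> draw one sample per action dimension.
    """
    rng = rng or random.Random()
    if deterministic:
        return list(mean)
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


# Assumed policy output for a single observation (illustrative values only).
mean = [0.2, -0.1, 0.5]
std = [0.3, 0.3, 0.3]

# The mode is the same on every call for the same observation.
mode_a = toy_predict(mean, std, deterministic=True)
mode_b = toy_predict(mean, std, deterministic=True)

# Samples differ from call to call (different seeds stand in for repeated calls).
sample_a = toy_predict(mean, std, rng=random.Random(0))
sample_b = toy_predict(mean, std, rng=random.Random(1))

print("mode:    ", mode_a, mode_b)
print("samples: ", sample_a, sample_b)
```

In the real library the equivalent switch would be `model.predict(obs, deterministic=True)`; with the default, two out-of-sample runs of the same trained model on the same data can produce different trades.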