environment? #13
Comments
No specific physical meaning, the purpose is to normalize the
Similar to action normalization in continuous envs
Maybe, worth trying! Hope for better results.
Thank you for your suggestion. I'll take a look at it later. I noticed that in your code, the states are not normalized to a range of [0, 1] or [-1, 1], but instead, they are divided by a value to make them fall within a different range, right?
Due to their different meanings, yes. Actually, I did not do many experiments here.
Sure. May I ask: if the states are simply converted from one range to another, what is the significance of normalizing the state?
Great question! To improve the efficiency and stability of learning, it is generally recommended to normalize states and actions, especially in some dynamic envs (e.g., MuJoCo). While [0, 1] or [-1, 1] may be a good choice for actions, I think it is best not to normalize the state too much, for better generalization (this needs experiments).
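To make the distinction in this reply concrete, here is a minimal sketch contrasting the two styles discussed: dividing by a hand-picked constant (as the state code in this issue does) and mapping a bounded quantity onto [-1, 1] (as is common for continuous actions). The function names, the raw values, and the action bounds of [-0.3, 0.3] are assumptions made up for illustration; none of them come from the repository.

```python
import numpy as np


def scale_fixed(x, divisor):
    """Divide by a hand-picked constant.

    The output range depends on the raw range, so the result is not
    guaranteed to lie in [-1, 1]; it is only brought to a similar magnitude.
    """
    return np.asarray(x, dtype=np.float64) / divisor


def scale_minmax(x, low, high):
    """Map the interval [low, high] linearly onto [-1, 1]."""
    x = np.asarray(x, dtype=np.float64)
    return 2.0 * (x - low) / (high - low) - 1.0


# Hypothetical raw values, chosen only for illustration.
raw_yaw_error = np.array([-1.0, 0.0, 0.5])  # assumed to be in radians
raw_steer = np.array([-0.3, 0.0, 0.3])      # assumed action bounds [-0.3, 0.3]

yaw_scaled = scale_fixed(raw_yaw_error, 2.0)
steer_scaled = scale_minmax(raw_steer, -0.3, 0.3)

print(yaw_scaled)    # magnitude reduced, range not pinned to [-1, 1]
print(steer_scaled)  # endpoints land exactly on -1 and 1
```

The fixed divisor keeps the relative size of the raw signal, while min-max scaling needs known bounds and pins the endpoints to -1 and 1.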
Thank you very much for your reply; it has been very helpful. When you normalize the state, is it based on experience, as in this case: delta_yaw_t = np.array(self.state_info['delta_yaw_t']).reshape((1, )) / 2.0? I would like to know more about why you divide by these specific values when processing different states. What is the reasoning behind choosing these particular numbers?
Hello, thank you for your work!
I would like to ask, when normalizing the state, what is the basis for setting the values? For example:
delta_yaw_t = np.array(self.state_info['delta_yaw_t']).reshape((1, )) / 2.0
dyaw_dt_t = np.array(self.state_info['dyaw_dt_t']).reshape((1, )) / 5.0
lateral_dist_t = self.state_info['lateral_dist_t'].reshape((1, )) * 10.0
action_last = self.state_info['action_t_1'] * 10.0
future_angles = self.state_info['angles_t'] / 2.0
What do the values you divide or multiply by represent? Your answer would be of great help to me. I look forward to your response.
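For readers following along, the snippet above can be read as one multiplier per state component. The sketch below collects those multipliers in a table and builds a flat observation vector; the `SCALES` dict, the `build_observation` helper, and every value in `state_info` are hypothetical and only illustrate the arithmetic, not the repository's actual implementation.

```python
import numpy as np

# Multipliers taken from the snippet above ("/ 2.0" is written as "* 0.5", etc.).
SCALES = {
    "delta_yaw_t": 1.0 / 2.0,
    "dyaw_dt_t": 1.0 / 5.0,
    "lateral_dist_t": 10.0,
    "action_t_1": 10.0,
    "angles_t": 1.0 / 2.0,
}


def build_observation(state_info):
    """Scale each component by its multiplier and concatenate into one 1-D vector."""
    parts = []
    for key, scale in SCALES.items():
        value = np.atleast_1d(np.asarray(state_info[key], dtype=np.float64)).ravel()
        parts.append(value * scale)
    return np.concatenate(parts)


# Hypothetical raw values; the real magnitudes depend on the environment.
state_info = {
    "delta_yaw_t": 0.4,
    "dyaw_dt_t": 1.0,
    "lateral_dist_t": np.array(0.05),
    "action_t_1": np.array([0.02]),
    "angles_t": np.array([0.2, -0.4]),
}

obs = build_observation(state_info)
print(obs)
```

With these made-up inputs, quantities that start out at quite different magnitudes (0.02 up to 1.0) end up within a similar range after scaling.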