Creating new environments is done over inheritance of the VSSS base class through the following steps:
- Setting the field type (0 for the 3 vs 3 competition field size, and 1 for 5 vs 5), the number of blue and yellow robots and the simulation time step.
- Defining the obsevation and action spaces as gym.Spaces types.
- Implement the base class methods:
- _get_commands
- _frame_to_observations
- _calculate_reward_and_done
- _get_initial_positions_frame
- Register the environment and set the maximum number of steps in an episode on init
The base environment define normalization methods using field size and robot parameters values which are used on the available environments.
In this environment each team has 3 robots, in which the id 0 blue robot is controlled through setting the desired wheel speed, while the other robots receive actions sampled from a Ornstein Uhlenbeck process. The episode ends when a goal occurs.
-
- Box(40,)
- Value Range: [-1.25, 1.25] (Normalized)
Index Observation 0 Ball X 1 Ball Y 2 Ball Vx 3 Ball Vy 4 + (7 * i) id i Blue Robot X 5 + (7 * i) id i Blue Robot Y 6 + (7 * i) id i Blue Robot sin(theta) 7 + (7 * i) id i Blue Robot cos(theta) 8 + (7 * i) id i Blue Robot Vx 9 + (7 * i) id i Blue Robot Vy 10 + (7 * i) id i Blue Robot v_theta 25 + (5 * i) id i Yellow Robot X 26 + (5 * i) id i Yellow Robot Y 27 + (5 * i) id i Yellow Robot Vx 28 + (5 * i) id i Yellow Robot Vy 29 + (5 * i) id i Yellow Robot v_theta -
- Box(2,)
- Value Range: [-1, 1]
Index Action 0 Wheel 0 speed 1 Wheel 1 speed -
- Move
- Ball potential gradient
- Energy
- Goal
-
When a goal happens
In this environment each team has 3 robots, in which the blue robots are controlled through setting the desired wheel speed, while the other robots receive actions sampled from a Ornstein Uhlenbeck process. The episode ends when a goal occurs.
-
- Box(3, 40)
- Value Range: [-1.25, 1.25] (Normalized)
Index Observation 0 Ball X 1 Ball Y 2 Ball Vx 3 Ball Vy 4 + (7 * i) id i Blue Robot X 5 + (7 * i) id i Blue Robot Y 6 + (7 * i) id i Blue Robot sin(theta) 7 + (7 * i) id i Blue Robot cos(theta) 8 + (7 * i) id i Blue Robot Vx 9 + (7 * i) id i Blue Robot Vy 10 + (7 * i) id i Blue Robot v_theta 25 + (5 * i) id i Yellow Robot X 26 + (5 * i) id i Yellow Robot Y 27 + (5 * i) id i Yellow Robot Vx 28 + (5 * i) id i Yellow Robot Vy 29 + (5 * i) id i Yellow Robot v_theta -
- Box(3, 2)
- Value Range: [-1, 1]
Index Action 0 Wheel 0 speed 1 Wheel 1 speed -
- Move (Individual)
- Ball potential gradient
- Energy (Individual)
- Goal
-
When a goal happens