v0.4.0
In this release we add:
Hierarchical policy support 📶
We have improved our support for hierarchical agent policies via the addition of a `SequentialDistr` class. This class allows for multi-stage hierarchical policy distributions. There are two common settings where this can be very useful. (1) You have an agent who needs to choose a high-level objective before choosing a low-level action (e.g. maybe it wants to `"MoveAhead"` if its high-level goal is to explore but wants to `"Crouch"` if its goal is to hide). This is naturally modeled as a hierarchical policy where the agent first samples its objective and then samples a low-level action conditional on this objective. (2) You have a conditional low-level action space, e.g. perhaps your agent needs to specify x,y coordinates when taking a `"PickupObject"` action but doesn't need such coordinates when taking a `"MoveAhead"` action.
We also include native support for hierarchical policies with `Categorical` sub-policies.
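To make the two-stage idea above concrete, here is a minimal sketch of hierarchical sampling using plain PyTorch `Categorical` distributions. It illustrates the sample-objective-then-sample-action pattern rather than the exact `SequentialDistr` API, and the module and attribute names (`HierarchicalPolicy`, `objective_head`, `action_head`) are purely illustrative.

```python
# Illustrative sketch (not AllenAct's exact SequentialDistr API): a policy that
# first samples a high-level objective and then samples a low-level action
# conditioned on that objective.
import torch
import torch.nn as nn
from torch.distributions import Categorical


class HierarchicalPolicy(nn.Module):  # hypothetical module, for illustration only
    def __init__(self, obs_dim: int, num_objectives: int, num_actions: int):
        super().__init__()
        self.objective_head = nn.Linear(obs_dim, num_objectives)
        # Low-level logits are conditioned on the observation and the
        # (one-hot encoded) sampled objective.
        self.action_head = nn.Linear(obs_dim + num_objectives, num_actions)

    def forward(self, obs: torch.Tensor):
        # Stage 1: sample the high-level objective.
        objective_distr = Categorical(logits=self.objective_head(obs))
        objective = objective_distr.sample()
        objective_onehot = nn.functional.one_hot(
            objective, objective_distr.logits.shape[-1]
        ).float()

        # Stage 2: sample the low-level action conditional on the objective.
        action_distr = Categorical(
            logits=self.action_head(torch.cat([obs, objective_onehot], dim=-1))
        )
        action = action_distr.sample()

        # Joint log-prob factorizes as log p(objective) + log p(action | objective).
        log_prob = objective_distr.log_prob(objective) + action_distr.log_prob(action)
        return objective, action, log_prob


policy = HierarchicalPolicy(obs_dim=16, num_objectives=2, num_actions=4)
objective, action, log_prob = policy(torch.randn(8, 16))
```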
Misc improvements 📚
We also include several smaller enhancements:
- Early stopping criteria for non-distributed training - It can be useful to automatically stop training when some success criterion is met (e.g. the training reward has saturated). This is now enabled (for non-distributed training) via the `early_stopping_criteria` parameter of the `PipelineStage` class; see the sketch after this list.
- Better error (and keyboard interrupt) handling - Depending on the complexity of a training task, AllenAct may start a large number of parallel processes. We are now more consistent about how we handle exit signals so that processes are less likely to remain alive after a kill signal is sent.
Backwards incompatible changes 💔
- Several vision sensors have moved from `allenact.base_abstractions.sensor` to `allenact.embodiedai.sensors.vision_sensors`. This change makes class locations more consistent.
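If your experiment configs import any of the moved sensors, the fix is just an import-path change. The snippet below uses `RGBSensor` as an example; whether a particular sensor class was among those moved should be confirmed against the new module.

```python
# Before (v0.3.x) -- assuming RGBSensor was among the moved vision sensors:
# from allenact.base_abstractions.sensor import RGBSensor

# After (v0.4.0):
from allenact.embodiedai.sensors.vision_sensors import RGBSensor
```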