
Wrappers

Christian Kauten edited this page Jul 22, 2018 · 20 revisions

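nes-py includes a collection of OpenAI Gym wrappers for standard functionality. These wrappers are located in the nes_py.wrappers module. Functionality covered includes:

  • BinarySpaceToDiscreteSpaceEnv: reducing the action space
  • ClipRewardEnv: clipping rewards to their sign
  • DownsampleEnv: converting frames to grayscale and resizing them
  • FrameStackEnv: stacking the previous k frames
  • NormalizeRewardEnv: normalizing rewards by the reward range
  • PenalizeDeathEnv: penalizing the end of an episode
  • RewardCacheEnv: caching cumulative episode rewards

There's also a single wrap function for combining them in a streamlined fashion.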

BinarySpaceToDiscreteSpaceEnv

A wrapper to define a much smaller action space. The initializer takes an environment and a list of button lists:

from nes_py.wrappers import BinarySpaceToDiscreteSpaceEnv
env = BinarySpaceToDiscreteSpaceEnv(env, [
    ['NOP'],
    ['right', 'A'],
    ['left', 'A'],
    ['A'],
])

Each button list should contain at least one of:

  • 'right'
  • 'left'
  • 'down'
  • 'up'
  • 'start'
  • 'select'
  • 'B'
  • 'A'
  • 'NOP'
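
To illustrate how a list of button lists can define a small discrete action space, here is a minimal sketch. It assumes that each button corresponds to one bit of a controller byte; the BUTTON_BITS layout and the helper names encode_buttons and build_action_table are hypothetical and are not taken from the nes-py source.

```python
# Hypothetical bit layout, for illustration only (one bit per button).
BUTTON_BITS = {
    'right':  0b10000000,
    'left':   0b01000000,
    'down':   0b00100000,
    'up':     0b00010000,
    'start':  0b00001000,
    'select': 0b00000100,
    'B':      0b00000010,
    'A':      0b00000001,
    'NOP':    0b00000000,
}


def encode_buttons(buttons):
    """Combine one button list into a single byte with bitwise OR."""
    byte = 0
    for name in buttons:
        byte |= BUTTON_BITS[name]
    return byte


def build_action_table(button_lists):
    """Map each discrete action index to the byte for its button list."""
    return {index: encode_buttons(buttons)
            for index, buttons in enumerate(button_lists)}


# Four discrete actions instead of every possible button combination.
actions = build_action_table([['NOP'], ['right', 'A'], ['left', 'A'], ['A']])
```

Under this assumption, an agent only has to choose among len(actions) discrete indices, and each index is translated back to a button combination before it reaches the emulator.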

Note

It is recommended for custom environments to use this wrapper by default. If there are multiple mappings, the package should provide each list and either register a different environment for each, or make note in the documentation for end users. See gym-super-mario-bros for an example of multiple button lists in the actions.py module.

ClipRewardEnv

A wrapper to clip rewards to their sign, i.e.:

  • reward = 1 if reward > 0
  • reward = -1 if reward < 0
  • reward = 0 if reward == 0

from nes_py.wrappers import ClipRewardEnv
env = ClipRewardEnv(env)
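
The clipping rule above can be sketched as a plain function. This is only an illustration of the rule; clip_reward is a hypothetical name, not part of nes-py.

```python
def clip_reward(reward):
    """Return the sign of the reward: 1, -1, or 0."""
    if reward > 0:
        return 1
    if reward < 0:
        return -1
    return 0


clipped = [clip_reward(r) for r in (3.5, -12, 0)]
```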

DownsampleEnv

A wrapper to convert RGB frames to a single Y (luminance) channel, i.e. grayscale, and resize them. It takes an environment and an image size as a tuple of (height, width). By default, the image size is (84, 84).

from nes_py.wrappers import DownsampleEnv
env = DownsampleEnv(env, (100, 120))
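
As a rough sketch of what downsampling involves, the following pure-Python example converts an RGB frame to a Y channel and resizes it with nearest-neighbor sampling. The luma weights (0.299, 0.587, 0.114) and the nearest-neighbor resize are assumptions chosen for illustration; the actual wrapper may use a different conversion or interpolation. All function names here are hypothetical.

```python
def to_luma(frame):
    """Convert an RGB frame (rows of (r, g, b) tuples) to one Y channel."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in frame]


def resize_nearest(image, size):
    """Resize a 2D image to (height, width) by nearest-neighbor sampling."""
    height, width = size
    src_height, src_width = len(image), len(image[0])
    return [
        [image[y * src_height // height][x * src_width // width]
         for x in range(width)]
        for y in range(height)
    ]


def downsample(frame, size=(84, 84)):
    """Grayscale the frame, then resize it to (height, width)."""
    return resize_nearest(to_luma(frame), size)


# A white 240 x 256 RGB frame stands in for an emulator screen.
frame = [[(255, 255, 255)] * 256 for _ in range(240)]
small = downsample(frame, (100, 120))
```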

FrameStackEnv

A wrapper to stack the previous k frames into an image tensor for each step. It takes an environment and a parameter k as the number of frames to stack.

from nes_py.wrappers import FrameStackEnv
env = FrameStackEnv(env, 4)
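
Frame stacking can be sketched with a fixed-length buffer. The FrameStack class below is a hypothetical stand-in, not the nes-py implementation; in particular, filling the buffer with copies of the first frame on reset is an assumption.

```python
from collections import deque


class FrameStack:
    """Keep the k most recent frames in a fixed-length buffer."""

    def __init__(self, k):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, first_frame):
        # Assumption: repeat the first frame so the stack always holds k frames.
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(first_frame)
        return list(self.frames)

    def step(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return list(self.frames)


stack = FrameStack(4)
initial = stack.reset('frame0')
after_step = stack.step('frame1')
```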

NormalizeRewardEnv

A wrapper to normalize rewards by the L-infinity norm of the environment's reward range. This wrapper only works for environments that have a finite reward_range variable (e.g. reward_range = (-15, 15)). It divides each reward by the value in the reward range with the largest absolute value.

from nes_py.wrappers import NormalizeRewardEnv
env = NormalizeRewardEnv(env)
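
The normalization described above can be sketched as follows. normalize_reward is a hypothetical helper, and raising an error for an unusable range is an assumption about how such a case might be handled.

```python
def normalize_reward(reward, reward_range):
    """Divide the reward by the largest absolute bound of the reward range."""
    low, high = reward_range
    scale = max(abs(low), abs(high))
    if scale == 0 or scale == float('inf'):
        # Assumption: an infinite or zero range cannot be normalized.
        raise ValueError('reward_range must be finite and non-zero')
    return reward / scale


normalized = normalize_reward(15, (-15, 15))
```

With reward_range = (-15, 15), every reward is divided by 15, so normalized rewards fall in [-1, 1].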

PenalizeDeathEnv

A wrapper to penalize the agent for terminating the episode (i.e. it sets the reward to a custom penalty when done is True). It takes an environment and a penalty value in its constructor.

from nes_py.wrappers import PenalizeDeathEnv
env = PenalizeDeathEnv(env, -15)
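
The penalty rule can be sketched as a function of the reward and the done flag. penalize_death is a hypothetical name used only for illustration.

```python
def penalize_death(reward, done, penalty=-15):
    """Replace the reward with the penalty when the episode has ended."""
    return penalty if done else reward


mid_episode = penalize_death(2, done=False)
at_death = penalize_death(2, done=True)
```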

RewardCacheEnv

A wrapper to cache cumulative episode rewards in a list for later analysis. This wrapper is useful for keeping track of un-mutated (raw) rewards from the environment. It accumulates the reward over an episode, then appends the total to a list.

from nes_py.wrappers import RewardCacheEnv
env = RewardCacheEnv(env)

Note

The wrapper adds an attribute episode_rewards to the env and appends the cumulative reward to it at the end of each episode.

Note

This wrapper should be the first wrapper used before any other reward mutators if you wish to observe the raw values.
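
The caching behavior can be sketched with a small accumulator. RewardCache and its record method are hypothetical; only the episode_rewards name comes from the text above.

```python
class RewardCache:
    """Accumulate rewards per episode and store each episode total."""

    def __init__(self):
        self.episode_rewards = []
        self._running_total = 0

    def record(self, reward, done):
        # Add the raw reward, then cache and reset when the episode ends.
        self._running_total += reward
        if done:
            self.episode_rewards.append(self._running_total)
            self._running_total = 0


cache = RewardCache()
for reward, done in [(1, False), (2, False), (-15, True), (4, True)]:
    cache.record(reward, done)
```

Because the totals are built from the rewards passed to record, the cache only reflects raw values if it sees them before any reward-mutating step.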

wrap

The wrap function provides a quick way to wrap an environment using any or all of the above wrappers, with the exception of BinarySpaceToDiscreteSpaceEnv, which is for special use. It takes the following parameters with default values:

Parameter             Default   Description
cache_rewards         True      whether to use the reward cache wrapper
image_size            (84, 84)  the image size for the downsample wrapper (None will disable it)
death_penalty         -15       the value for the death penalty wrapper (None will disable it)
clip_rewards          False     whether to use the clip rewards wrapper
normalize_rewards     False     whether to use the normalize rewards wrapper
agent_history_length  4         the value for the frame stack wrapper (None will disable it)

from nes_py.wrappers import wrap
env = wrap(env,
    cache_rewards=False,
    image_size=(120, 100),
    death_penalty=None,
    clip_rewards=True,
    normalize_rewards=False,
    agent_history_length=3,
)
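
To show how the parameters select wrappers, here is a small sketch that returns the names of the wrappers a given configuration would enable. plan_wrappers is a hypothetical helper, and the order of the returned list is an assumption (apart from placing the reward cache first, as recommended above); it does not describe the actual internals of wrap.

```python
def plan_wrappers(cache_rewards=True, image_size=(84, 84), death_penalty=-15,
                  clip_rewards=False, normalize_rewards=False,
                  agent_history_length=4):
    """List the wrappers enabled by a configuration (order is assumed)."""
    plan = []
    if cache_rewards:
        plan.append('RewardCacheEnv')
    if image_size is not None:
        plan.append('DownsampleEnv')
    if death_penalty is not None:
        plan.append('PenalizeDeathEnv')
    if normalize_rewards:
        plan.append('NormalizeRewardEnv')
    if clip_rewards:
        plan.append('ClipRewardEnv')
    if agent_history_length is not None:
        plan.append('FrameStackEnv')
    return plan


defaults = plan_wrappers()
custom = plan_wrappers(cache_rewards=False, image_size=(120, 100),
                       death_penalty=None, clip_rewards=True,
                       normalize_rewards=False, agent_history_length=3)
```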