Spaces#

class gym.spaces.Space(shape: Optional[Sequence[int]] = None, dtype: Optional[Type | str] = None, seed: Optional[int] = None)#

Defines the observation and action spaces, so you can write generic code that applies to any Env. For example, you can choose a random action.

WARNING - Custom observation & action spaces can inherit from the Space class. However, most use-cases should be covered by the existing space classes (e.g. Box, Discrete, etc…), and container classes (Tuple & Dict). Note that parametrized probability distributions (through the sample() method), and batching functions (in gym.vector.VectorEnv), are only well-defined for instances of spaces provided in gym by default. Moreover, some implementations of Reinforcement Learning algorithms might not handle custom spaces properly. Use custom spaces with care.

General Functions#

Each space implements the following functions:

gym.spaces.Space.sample(self) → T_cov#: Randomly sample an element of this space. Sampling can be uniform or non-uniform, depending on whether the space is bounded.

gym.spaces.Space.contains(self, x) → bool#: Return a boolean specifying whether x is a valid member of this space.

property Space.shape: Optional[tuple[int, ...]]#: Return the shape of the space as an immutable property

property gym.spaces.Space.dtype#: Return the data type of this space.

gym.spaces.Space.seed(self, seed: Optional[int] = None) → list#: Seed the PRNG of this space.

gym.spaces.Space.to_jsonable(self, sample_n: Sequence[T_cov]) → list#: Convert a batch of samples from this space to a JSONable data type.

gym.spaces.Space.from_jsonable(self, sample_n: list) → list[T_cov]#: Convert a JSONable data type to a batch of samples from this space.
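
The functions above can be exercised on any concrete space. The following is a minimal sketch using a small Box; the bounds, the seed value 42, and the variable names are illustrative choices, and exact return types may differ slightly between gym versions.

```python
import numpy as np
from gym import spaces

space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)

# Seed the space's PRNG so that sampling is reproducible.
space.seed(42)
first = space.sample()

# Re-seeding with the same value replays the same draw.
space.seed(42)
second = space.sample()
same_draw = bool(np.array_equal(first, second))

# Membership checks: a sample is always inside, a far-away point is not.
is_member = bool(space.contains(first))
is_outside_member = bool(space.contains(np.array([5.0, 5.0], dtype=np.float32)))

# Round-trip a batch of samples through a JSON-friendly representation.
batch = [first, second]
as_json = space.to_jsonable(batch)
restored = space.from_jsonable(as_json)

print(space.shape, space.dtype, same_draw, is_member, is_outside_member)
```

Here to_jsonable receives a batch (a sequence of samples) rather than a single sample, which matches the signatures listed above.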

Box#

class gym.spaces.Box(low: ~typing.Union[~typing.SupportsFloat, ~numpy.ndarray], high: ~typing.Union[~typing.SupportsFloat, ~numpy.ndarray], shape: ~typing.Optional[~typing.Sequence[int]] = None, dtype: ~typing.Type = <class 'numpy.float32'>, seed: ~typing.Optional[int] = None)#

A (possibly unbounded) box in R^n. Specifically, a Box represents the Cartesian product of n closed intervals. Each interval has the form of one of [a, b], (-oo, b], [a, oo), or (-oo, oo).

There are two common use cases:

Identical bound for each dimension::

>>> Box(low=-1.0, high=2.0, shape=(3, 4), dtype=np.float32)
Box(3, 4)

Independent bound for each dimension::

>>> Box(low=np.array([-1.0, -2.0]), high=np.array([2.0, 4.0]), dtype=np.float32)
Box(2,)

__init__(low: ~typing.Union[~typing.SupportsFloat, ~numpy.ndarray], high: ~typing.Union[~typing.SupportsFloat, ~numpy.ndarray], shape: ~typing.Optional[~typing.Sequence[int]] = None, dtype: ~typing.Type = <class 'numpy.float32'>, seed: ~typing.Optional[int] = None)#

is_bounded(manner: str = 'both') → bool#

sample() → ndarray#

Generates a single random sample inside of the Box.

In creating a sample of the box, each coordinate is sampled according to the form of the interval:

[a, b] : uniform distribution
[a, oo) : shifted exponential distribution
(-oo, b] : shifted negative exponential distribution
(-oo, oo) : normal distribution
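
As a rough illustration of the four interval forms, the sketch below builds a Box with one coordinate of each kind and checks that samples respect the finite bounds. The specific bounds, the seed, and the sample count are assumptions made for the example.

```python
import numpy as np
from gym import spaces

# One coordinate per interval form: [0, 1], [0, oo), (-oo, 0], (-oo, oo)
low = np.array([0.0, 0.0, -np.inf, -np.inf], dtype=np.float32)
high = np.array([1.0, np.inf, 0.0, np.inf], dtype=np.float32)
box = spaces.Box(low=low, high=high, dtype=np.float32)

box.seed(0)
samples = np.stack([box.sample() for _ in range(1000)])

# Every coordinate stays on the correct side of its finite bounds.
bounded_ok = bool(np.all((samples[:, 0] >= 0.0) & (samples[:, 0] <= 1.0)))
lower_ok = bool(np.all(samples[:, 1] >= 0.0))
upper_ok = bool(np.all(samples[:, 2] <= 0.0))

# The box as a whole is not bounded on both sides in every dimension.
fully_bounded = bool(box.is_bounded())

print(bounded_ok, lower_ok, upper_ok, fully_bounded)
```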

Discrete#

class gym.spaces.Discrete(n: int, seed: Optional[int] = None, start: int = 0)#

A discrete space in \(\{ 0, 1, \dots, n-1 \}\).

A start value can be optionally specified to shift the range to \(\{ a, a+1, \dots, a+n-1 \}\).

Example:

>>> Discrete(2)            # {0, 1}
>>> Discrete(3, start=-1)  # {-1, 0, 1}
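
The effect of start can be checked by sampling. This is a small sketch, with the seed and the number of draws chosen arbitrarily:

```python
from gym import spaces

shifted = spaces.Discrete(3, start=-1)  # elements are -1, 0, 1
shifted.seed(0)

# Collect the distinct values seen over many draws.
draws = {int(shifted.sample()) for _ in range(200)}

inside = bool(shifted.contains(1))    # 1 is the largest valid element
outside = bool(shifted.contains(2))   # 2 lies just past the shifted range

print(sorted(draws), inside, outside)
```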

class __init__(*args, **kwargs)#: Initialize self. See help(type(self)) for accurate signature.

MultiBinary#

class gym.spaces.MultiBinary(n: Union[ndarray, Sequence[int], int], seed: Optional[int] = None)#

An n-shape binary space.

The argument to MultiBinary defines n, which could be a number or a list of numbers.

Example Usage:

>>> self.observation_space = spaces.MultiBinary(5)
>>> self.observation_space.sample()
array([0, 1, 0, 1, 0], dtype=int8)
>>> self.observation_space = spaces.MultiBinary([3, 2])
>>> self.observation_space.sample()
array([[0, 0],
       [0, 1],
       [1, 1]], dtype=int8)

MultiDiscrete#

class gym.spaces.MultiDiscrete(nvec: list[int], dtype=<class 'numpy.int64'>, seed=None)#

The multi-discrete action space consists of a series of discrete action spaces, each with its own number of actions.
It is useful for representing game controllers or keyboards, where each key can be represented as a discrete action space.
It is parametrized by passing an array of positive integers that specify the number of actions for each discrete action space.

Note: Some environment wrappers assume a value of 0 always represents the NOOP action.

e.g. Nintendo Game Controller - Can be conceptualized as 3 discrete action spaces:

Arrow Keys: Discrete 5 - NOOP[0], UP[1], RIGHT[2], DOWN[3], LEFT[4] - params: min: 0, max: 4

Button A: Discrete 2 - NOOP[0], Pressed[1] - params: min: 0, max: 1

Button B: Discrete 2 - NOOP[0], Pressed[1] - params: min: 0, max: 1

Can be initialized as

MultiDiscrete([ 5, 2, 2 ])

__init__(nvec: list[int], dtype=<class 'numpy.int64'>, seed=None)#: nvec: vector of counts of each categorical variable
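
The controller example above can be sketched as follows. The variable names and the seed are illustrative, and the printed action depends on the PRNG, so no particular output is claimed.

```python
import numpy as np
from gym import spaces

# Arrow keys (5 options), button A (2 options), button B (2 options)
controller = spaces.MultiDiscrete([5, 2, 2])
controller.seed(0)

action = controller.sample()  # one integer per sub-action

# Each component must lie in [0, nvec[i]).
valid = bool(controller.contains(np.array([4, 1, 1])))
invalid = bool(controller.contains(np.array([5, 0, 0])))  # 5 is out of range for the arrow keys

print(action, valid, invalid)
```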

Dict#

class gym.spaces.Dict(spaces: dict[str, Space] | None = None, seed: dict | int | None = None, **spaces_kwargs: Space)#

A dictionary of simpler spaces.

Example usage:

self.observation_space = spaces.Dict({"position": spaces.Discrete(2), "velocity": spaces.Discrete(3)})

Example usage [nested]:

self.nested_observation_space = spaces.Dict({
    'sensors': spaces.Dict({
        'position': spaces.Box(low=-100, high=100, shape=(3,)),
        'velocity': spaces.Box(low=-1, high=1, shape=(3,)),
        'front_cam': spaces.Tuple((
            spaces.Box(low=0, high=1, shape=(10, 10, 3)),
            spaces.Box(low=0, high=1, shape=(10, 10, 3))
        )),
        'rear_cam': spaces.Box(low=0, high=1, shape=(10, 10, 3)),
    }),
    'ext_controller': spaces.MultiDiscrete((5, 2, 2)),
    'inner_state': spaces.Dict({
        'charge': spaces.Discrete(100),
        'system_checks': spaces.MultiBinary(10),
        'job_status': spaces.Dict({
            'task': spaces.Discrete(5),
            'progress': spaces.Box(low=0, high=100, shape=()),
        })
    })
})

__init__(spaces: dict[str, Space] | None = None, seed: dict | int | None = None, **spaces_kwargs: Space)#
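
A sample from a Dict space is itself a dictionary with one entry per sub-space. The sketch below reuses the simple position/velocity example; the seed is arbitrary and the concrete mapping type returned by sample() may vary between gym versions.

```python
from gym import spaces

observation_space = spaces.Dict({
    "position": spaces.Discrete(2),
    "velocity": spaces.Discrete(3),
})
observation_space.seed(0)

obs = observation_space.sample()
keys = set(obs.keys())

# Sub-spaces can be looked up by key, and samples are valid members.
position_space = observation_space["position"]
is_member = bool(observation_space.contains(obs))

print(keys, position_space, is_member)
```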

Tuple#

class gym.spaces.Tuple(spaces: Iterable[Space], seed: Optional[Union[int, List[int]]] = None)#

A tuple (i.e., product) of simpler spaces.

Example usage: self.observation_space = spaces.Tuple((spaces.Discrete(2), spaces.Discrete(3)))

__init__(spaces: Iterable[Space], seed: Optional[Union[int, List[int]]] = None)#
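
Samples from a Tuple space are Python tuples with one element per sub-space, in order. A minimal sketch, with an arbitrary seed:

```python
from gym import spaces

pair = spaces.Tuple((spaces.Discrete(2), spaces.Discrete(3)))
pair.seed(0)

sample = pair.sample()

# Membership is checked element-wise against the sub-spaces.
in_space = bool(pair.contains((1, 2)))
out_of_space = bool(pair.contains((2, 0)))  # 2 is not in Discrete(2)

print(sample, in_space, out_of_space)
```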

Utility Functions#

gym.spaces.utils.flatdim(space: Space) → int#

gym.spaces.utils.flatdim(space: Union[Box, MultiBinary]) → int

gym.spaces.utils.flatdim(space: Discrete) → int

gym.spaces.utils.flatdim(space: MultiDiscrete) → int

gym.spaces.utils.flatdim(space: Tuple) → int

gym.spaces.utils.flatdim(space: Dict) → int

Return the number of dimensions a flattened equivalent of this space would have.

Accepts a space and returns an integer. Raises NotImplementedError if the space is not defined in gym.spaces.
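
For instance, flatdim can be evaluated on a few of the built-in spaces. In this sketch a Discrete space contributes n dimensions, which is consistent with the Box(5,) result shown for Discrete(5) in the flatten_space example on this page.

```python
from gym import spaces
from gym.spaces.utils import flatdim

box_dim = flatdim(spaces.Box(0.0, 1.0, shape=(3, 4)))   # 3 * 4 = 12
discrete_dim = flatdim(spaces.Discrete(5))              # 5
binary_dim = flatdim(spaces.MultiBinary(4))             # 4

# Containers add up the flat dimensions of their sub-spaces.
tuple_dim = flatdim(spaces.Tuple((spaces.Discrete(2), spaces.Discrete(3))))  # 2 + 3 = 5
dict_dim = flatdim(spaces.Dict({
    "position": spaces.Discrete(2),
    "velocity": spaces.Box(0.0, 1.0, shape=(2, 2)),
}))  # 2 + 4 = 6

print(box_dim, discrete_dim, binary_dim, tuple_dim, dict_dim)
```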

gym.spaces.utils.flatten_space(space: Space) → Box#

gym.spaces.utils.flatten_space(space: Box) → Box

gym.spaces.utils.flatten_space(space: Union[Discrete, MultiBinary, MultiDiscrete]) → Box

gym.spaces.utils.flatten_space(space: Tuple) → Box

gym.spaces.utils.flatten_space(space: Dict) → Box

Flatten a space into a single Box.

This is equivalent to flatten(), but operates on the space itself. The result is always a Box with flat boundaries and exactly flatdim(space) dimensions. Flattening a sample of the original space has the same effect as taking a sample of the flattened space.

Raises NotImplementedError if the space is not defined in gym.spaces.

Example:

>>> box = Box(0.0, 1.0, shape=(3, 4, 5))
>>> box
Box(3, 4, 5)
>>> flatten_space(box)
Box(60,)
>>> flatten(box, box.sample()) in flatten_space(box)
True

Example that flattens a discrete space:

>>> discrete = Discrete(5)
>>> flatten_space(discrete)
Box(5,)
>>> flatten(discrete, discrete.sample()) in flatten_space(discrete)
True

Example that recursively flattens a dict:

>>> space = Dict({"position": Discrete(2),
...               "velocity": Box(0, 1, shape=(2, 2))})
>>> flatten_space(space)
Box(6,)
>>> flatten(space, space.sample()) in flatten_space(space)
True

gym.spaces.utils.flatten(space: Space[T], x: T) → ndarray#

gym.spaces.utils.flatten(space: MultiBinary, x) → ndarray

gym.spaces.utils.flatten(space: Box, x) → ndarray

gym.spaces.utils.flatten(space: Discrete, x) → ndarray

gym.spaces.utils.flatten(space: MultiDiscrete, x) → ndarray

gym.spaces.utils.flatten(space: Tuple, x) → ndarray

gym.spaces.utils.flatten(space: Dict, x) → ndarray

Flatten a data point from a space.

This is useful when, for example, points from spaces must be passed to a neural network that only accepts flat arrays of floats.

Accepts a space and a point from that space. Always returns a 1D array. Raises NotImplementedError if the space is not defined in gym.spaces.
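
As a small sketch of what flatten produces, the example below flattens a Discrete value and a Dict point. The points are hand-picked for illustration, and the Dict layout shown assumes sub-spaces are concatenated in key order (position before velocity).

```python
import numpy as np
from gym import spaces
from gym.spaces.utils import flatten

# A Discrete value becomes a one-hot vector of length n.
discrete = spaces.Discrete(5)
one_hot = flatten(discrete, 3)  # array([0, 0, 0, 1, 0])

# A Dict point becomes the concatenation of its flattened entries.
space = spaces.Dict({
    "position": spaces.Discrete(2),
    "velocity": spaces.Box(0.0, 1.0, shape=(2, 2), dtype=np.float32),
})
point = {
    "position": 1,
    "velocity": np.full((2, 2), 0.5, dtype=np.float32),
}
flat = flatten(space, point)  # one-hot for position, then 4 velocity values

print(one_hot, flat)
```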

gym.spaces.utils.unflatten(space: Space[T], x: ndarray) → T#

gym.spaces.utils.unflatten(space: Box | MultiBinary, x: np.ndarray) → np.ndarray

gym.spaces.utils.unflatten(space: Discrete, x: ndarray) → int

gym.spaces.utils.unflatten(space: MultiDiscrete, x: ndarray) → ndarray

gym.spaces.utils.unflatten(space: Tuple, x: ndarray) → tuple

gym.spaces.utils.unflatten(space: Dict, x: ndarray) → dict

Unflatten a data point from a space.

This reverses the transformation applied by flatten(). You must ensure that the space argument is the same as for the flatten() call.

Accepts a space and a flattened point. Returns a point with a structure that matches the space. Raises NotImplementedError if the space is not defined in gym.spaces.
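
A round trip through flatten() and unflatten() can be sketched as follows. The Tuple space and the point are illustrative assumptions; the same space object is passed to both calls, as required above.

```python
import numpy as np
from gym import spaces
from gym.spaces.utils import flatten, unflatten

space = spaces.Tuple((
    spaces.Discrete(3),
    spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32),
))
point = (2, np.array([0.25, -0.5], dtype=np.float32))

flat = flatten(space, point)        # one-hot for 2, followed by the Box values
restored = unflatten(space, flat)   # structure matches the Tuple space again

round_trip_ok = (
    int(restored[0]) == point[0]
    and bool(np.allclose(restored[1], point[1]))
)
print(flat, restored, round_trip_ok)
```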