Spaces#
- class gym.spaces.Space(shape: Optional[Sequence[int]] = None, dtype: Optional[Type | str] = None, seed: Optional[int] = None)#
Defines the observation and action spaces, so you can write generic code that applies to any Env. For example, you can choose a random action.
WARNING - Custom observation & action spaces can inherit from the Space class. However, most use-cases should be covered by the existing space classes (e.g. Box, Discrete, etc…), and container classes (Tuple & Dict). Note that parametrized probability distributions (through the sample() method), and batching functions (in gym.vector.VectorEnv), are only well-defined for instances of spaces provided in gym by default. Moreover, some implementations of Reinforcement Learning algorithms might not handle custom spaces properly. Use custom spaces with care.
General Functions#
Each space implements the following functions:
- gym.spaces.Space.sample(self) T_cov#
Randomly sample an element of this space. Can be uniform or non-uniform sampling based on boundedness of space.
- gym.spaces.Space.contains(self, x) bool#
Return a boolean specifying whether x is a valid member of this space.
- property Space.shape: Optional[tuple[int, ...]]#
Return the shape of the space as an immutable property
- property gym.spaces.Space.dtype#
Return the data type of this space.
- gym.spaces.Space.seed(self, seed: Optional[int] = None) list#
Seed the PRNG of this space.
- gym.spaces.Space.to_jsonable(self, sample_n: Sequence[T_cov]) list#
Convert a batch of samples from this space to a JSONable data type.
- gym.spaces.Space.from_jsonable(self, sample_n: list) list[T_cov]#
Convert a JSONable data type to a batch of samples from this space.
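The contract above can be illustrated without importing gym. The following is a minimal sketch, not gym's implementation: the class name `IntRangeSketch` is hypothetical, it does not subclass `gym.spaces.Space`, and it uses Python's `random` module in place of gym's own PRNG. It only mirrors the method names listed above.

```python
import json
import random


class IntRangeSketch:
    """Hypothetical stand-in mirroring the Space method names; not gym code."""

    def __init__(self, n, seed=None):
        self.n = n
        self.shape = ()  # a scalar space has an empty shape
        self.dtype = int
        self._rng = random.Random(seed)

    def seed(self, seed=None):
        # Re-seed the private PRNG and report the seed that was used, as a list.
        self._rng = random.Random(seed)
        return [seed]

    def sample(self):
        # Draw one element of {0, ..., n - 1} uniformly at random.
        return self._rng.randrange(self.n)

    def contains(self, x):
        return isinstance(x, int) and 0 <= x < self.n

    def to_jsonable(self, sample_n):
        # A batch of samples becomes a plain list of ints.
        return [int(x) for x in sample_n]

    def from_jsonable(self, sample_n):
        return [int(x) for x in sample_n]


space = IntRangeSketch(4)
space.seed(123)
first = [space.sample() for _ in range(5)]
space.seed(123)
second = [space.sample() for _ in range(5)]

# Round-trip a batch of samples through JSON text.
encoded = json.dumps(space.to_jsonable(first))
round_trip = space.from_jsonable(json.loads(encoded))
```

Seeding twice with the same value reproduces the same sequence of samples, which is the reason to seed a space before running a repeatable experiment.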
Box#
- class gym.spaces.Box(low: Union[SupportsFloat, ndarray], high: Union[SupportsFloat, ndarray], shape: Optional[Sequence[int]] = None, dtype: Type = <class 'numpy.float32'>, seed: Optional[int] = None)#
A (possibly unbounded) box in R^n. Specifically, a Box represents the Cartesian product of n closed intervals. Each interval has the form of one of [a, b], (-oo, b], [a, oo), or (-oo, oo).
There are two common use cases:
- Identical bound for each dimension:
>>> Box(low=-1.0, high=2.0, shape=(3, 4), dtype=np.float32)
Box(3, 4)
- Independent bound for each dimension:
>>> Box(low=np.array([-1.0, -2.0]), high=np.array([2.0, 4.0]), dtype=np.float32)
Box(2,)
- __init__(low: Union[SupportsFloat, ndarray], high: Union[SupportsFloat, ndarray], shape: Optional[Sequence[int]] = None, dtype: Type = <class 'numpy.float32'>, seed: Optional[int] = None)#
- is_bounded(manner: str = 'both') bool#
- sample() ndarray#
Generates a single random sample inside of the Box.
In creating a sample of the box, each coordinate is sampled according to the form of the interval:
[a, b] : uniform distribution
[a, oo) : shifted exponential distribution
(-oo, b] : shifted negative exponential distribution
(-oo, oo) : normal distribution
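The per-coordinate rule above can be sketched in plain Python. This is an illustration, not gym's code: the helper name `sample_coordinate` is hypothetical, and the unit rate of the exponential and the standard normal (mean 0, standard deviation 1) are assumptions made for the sketch.

```python
import math
import random


def sample_coordinate(low, high, rng):
    """Sample one coordinate according to the form of its interval."""
    bounded_below = low > -math.inf
    bounded_above = high < math.inf
    if bounded_below and bounded_above:
        return rng.uniform(low, high)        # [a, b]: uniform
    if bounded_below:
        return low + rng.expovariate(1.0)    # [a, oo): shifted exponential
    if bounded_above:
        return high - rng.expovariate(1.0)   # (-oo, b]: shifted negative exponential
    return rng.gauss(0.0, 1.0)               # (-oo, oo): normal


rng = random.Random(0)
lows = [-1.0, 0.0, -math.inf, -math.inf]
highs = [2.0, math.inf, 5.0, math.inf]

# One coordinate per interval form, in the same order as the list above.
point = [sample_coordinate(lo, hi, rng) for lo, hi in zip(lows, highs)]
```

Each coordinate is drawn independently, so a box may mix bounded and unbounded dimensions freely.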
Discrete#
- class gym.spaces.Discrete(n: int, seed: Optional[int] = None, start: int = 0)#
A discrete space in \(\{ 0, 1, \dots, n-1 \}\).
A start value can be optionally specified to shift the range to \(\{ a, a+1, \dots, a+n-1 \}\).
Example:
>>> Discrete(2)  # {0, 1}
>>> Discrete(3, start=-1)  # {-1, 0, 1}
- class __init__(*args, **kwargs)#
Initialize self. See help(type(self)) for accurate signature.
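The effect of the start argument on a discrete space can be sketched without gym. The helper names `discrete_members` and `discrete_sample` are hypothetical, and uniform sampling is an assumption of this sketch.

```python
import random


def discrete_members(n, start=0):
    """Return the set {start, start + 1, ..., start + n - 1}."""
    return set(range(start, start + n))


def discrete_sample(n, start=0, rng=random):
    """Draw one member uniformly at random (assumed distribution)."""
    return start + rng.randrange(n)


rng = random.Random(0)
draws = [discrete_sample(3, start=-1, rng=rng) for _ in range(100)]
```

Shifting by start changes which integers are valid members but not how many there are.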
MultiBinary#
- class gym.spaces.MultiBinary(n: Union[ndarray, Sequence[int], int], seed: Optional[int] = None)#
An n-shape binary space.
The argument to MultiBinary defines n, which could be a number or a list of numbers.
Example Usage:
>>> self.observation_space = spaces.MultiBinary(5)
>>> self.observation_space.sample()
array([0, 1, 0, 1, 0], dtype=int8)
>>> self.observation_space = spaces.MultiBinary([3, 2])
>>> self.observation_space.sample()
array([[0, 0],
       [0, 1],
       [1, 1]], dtype=int8)
MultiDiscrete#
- class gym.spaces.MultiDiscrete(nvec: list[int], dtype=<class 'numpy.int64'>, seed=None)#
The multi-discrete action space consists of a series of discrete action spaces, each with its own number of actions.
It is useful for representing game controllers or keyboards, where each key can be modeled as a discrete action space.
It is parametrized by passing an array of positive integers specifying the number of actions for each discrete action space.
Note: Some environment wrappers assume a value of 0 always represents the NOOP action.
e.g. Nintendo Game Controller - Can be conceptualized as 3 discrete action spaces:
Arrow Keys: Discrete 5 - NOOP[0], UP[1], RIGHT[2], DOWN[3], LEFT[4] - params: min: 0, max: 4
Button A: Discrete 2 - NOOP[0], Pressed[1] - params: min: 0, max: 1
Button B: Discrete 2 - NOOP[0], Pressed[1] - params: min: 0, max: 1
Can be initialized as
MultiDiscrete([ 5, 2, 2 ])
- __init__(nvec: list[int], dtype=<class 'numpy.int64'>, seed=None)#
nvec: vector of counts of each categorical variable
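The game controller example can be sketched in plain Python to show how an action vector lines up with nvec. The helper names `sample_multidiscrete` and `describe` are hypothetical and the sketch does not use gym; it assumes each component is drawn uniformly.

```python
import random

NVEC = [5, 2, 2]  # arrow keys, button A, button B
ARROWS = ["NOOP", "UP", "RIGHT", "DOWN", "LEFT"]
BUTTON = ["NOOP", "Pressed"]


def sample_multidiscrete(nvec, rng):
    """Draw one integer in [0, n) for every entry n of nvec."""
    return [rng.randrange(n) for n in nvec]


def describe(action):
    """Translate an action vector into the labels from the controller example."""
    arrow, a, b = action
    return {"arrows": ARROWS[arrow], "A": BUTTON[a], "B": BUTTON[b]}


rng = random.Random(0)
action = sample_multidiscrete(NVEC, rng)
labels = describe(action)
noop = describe([0, 0, 0])
```

An all-zero vector maps to NOOP in every sub-space, which is the convention that the note above says some wrappers rely on.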
Dict#
- class gym.spaces.Dict(spaces: dict[str, Space] | None = None, seed: dict | int | None = None, **spaces_kwargs: Space)#
A dictionary of simpler spaces.
Example usage:
    self.observation_space = spaces.Dict({"position": spaces.Discrete(2), "velocity": spaces.Discrete(3)})
Example usage [nested]:
    self.nested_observation_space = spaces.Dict({
        'sensors': spaces.Dict({
            'position': spaces.Box(low=-100, high=100, shape=(3,)),
            'velocity': spaces.Box(low=-1, high=1, shape=(3,)),
            'front_cam': spaces.Tuple((
                spaces.Box(low=0, high=1, shape=(10, 10, 3)),
                spaces.Box(low=0, high=1, shape=(10, 10, 3))
            )),
            'rear_cam': spaces.Box(low=0, high=1, shape=(10, 10, 3)),
        }),
        'ext_controller': spaces.MultiDiscrete((5, 2, 2)),
        'inner_state': spaces.Dict({
            'charge': spaces.Discrete(100),
            'system_checks': spaces.MultiBinary(10),
            'job_status': spaces.Dict({
                'task': spaces.Discrete(5),
                'progress': spaces.Box(low=0, high=100, shape=()),
            })
        })
    })
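A sample from a dictionary space is itself a dictionary with the same keys, each holding a sample of the corresponding sub-space. The sketch below imitates that recursion in plain Python on a trimmed version of the nested example; the zero-argument samplers and the helper `sample_nested` are hypothetical stand-ins for real sub-spaces.

```python
import random

rng = random.Random(0)

# Each leaf is a zero-argument sampler; nested dicts play the role of nested Dict spaces.
nested = {
    "sensors": {
        "position": lambda: [rng.uniform(-100, 100) for _ in range(3)],
        "velocity": lambda: [rng.uniform(-1, 1) for _ in range(3)],
    },
    "inner_state": {
        "charge": lambda: rng.randrange(100),
        "job_status": {"task": lambda: rng.randrange(5)},
    },
}


def sample_nested(node):
    """Recursively sample every leaf while keeping the dictionary structure."""
    if isinstance(node, dict):
        return {key: sample_nested(child) for key, child in node.items()}
    return node()


obs = sample_nested(nested)
```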
Tuple#
Utility Functions#
- gym.spaces.utils.flatdim(space: Space) int#
- gym.spaces.utils.flatdim(space: Union[Box, MultiBinary]) int
- gym.spaces.utils.flatdim(space: Discrete) int
- gym.spaces.utils.flatdim(space: MultiDiscrete) int
- gym.spaces.utils.flatdim(space: Tuple) int
- gym.spaces.utils.flatdim(space: Dict) int
Return the number of dimensions a flattened equivalent of this space would have.
Accepts a space and returns an integer. Raises NotImplementedError if the space is not defined in gym.spaces.
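The flat dimension is computed recursively: containers add up the dimensions of their members. The following gym-free sketch works on a hypothetical tuple-based description of a space (the `("box", shape)` style tags and the name `flatdim_sketch` are invented for illustration) and covers only boxes, discrete spaces, tuples, and dicts. It assumes a discrete space with n values flattens to n entries and a box to the product of its shape.

```python
from math import prod


def flatdim_sketch(spec):
    """Flat dimension of a hypothetical nested space description."""
    kind = spec[0]
    if kind == "box":        # ("box", shape): one entry per array element
        return prod(spec[1])
    if kind == "discrete":   # ("discrete", n): one entry per possible value
        return spec[1]
    if kind == "tuple":      # ("tuple", [spec, ...]): sum over members
        return sum(flatdim_sketch(s) for s in spec[1])
    if kind == "dict":       # ("dict", {key: spec}): sum over values
        return sum(flatdim_sketch(s) for s in spec[1].values())
    raise NotImplementedError(f"unknown space kind: {kind!r}")


nested = ("dict", {"position": ("discrete", 2), "velocity": ("box", (2, 2))})
total = flatdim_sketch(nested)
```

Here the discrete part contributes 2 entries and the 2 by 2 box contributes 4, giving 6 in total.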
- gym.spaces.utils.flatten_space(space: Space) Box#
- gym.spaces.utils.flatten_space(space: Box) Box
- gym.spaces.utils.flatten_space(space: Union[Discrete, MultiBinary, MultiDiscrete]) Box
- gym.spaces.utils.flatten_space(space: Tuple) Box
- gym.spaces.utils.flatten_space(space: Dict) Box
Flatten a space into a single Box.
This is equivalent to flatten(), but operates on the space itself. The result is always a Box with flat boundaries. The box has exactly flatdim(space) dimensions. Flattening a sample of the original space has the same effect as taking a sample of the flattened space.
Raises NotImplementedError if the space is not defined in gym.spaces.
Example:
>>> box = Box(0.0, 1.0, shape=(3, 4, 5))
>>> box
Box(3, 4, 5)
>>> flatten_space(box)
Box(60,)
>>> flatten(box, box.sample()) in flatten_space(box)
True
Example that flattens a discrete space:
>>> discrete = Discrete(5)
>>> flatten_space(discrete)
Box(5,)
>>> flatten(discrete, discrete.sample()) in flatten_space(discrete)
True
Example that recursively flattens a dict:
>>> space = Dict({"position": Discrete(2),
...               "velocity": Box(0, 1, shape=(2, 2))})
>>> flatten_space(space)
Box(6,)
>>> flatten(space, space.sample()) in flatten_space(space)
True
- gym.spaces.utils.flatten(space: Space[T], x: T) ndarray#
- gym.spaces.utils.flatten(space: MultiBinary, x) ndarray
- gym.spaces.utils.flatten(space: Box, x) ndarray
- gym.spaces.utils.flatten(space: Discrete, x) ndarray
- gym.spaces.utils.flatten(space: MultiDiscrete, x) ndarray
- gym.spaces.utils.flatten(space: Tuple, x) ndarray
- gym.spaces.utils.flatten(space: Dict, x) ndarray
Flatten a data point from a space.
This is useful when e.g. points from spaces must be passed to a neural network, which only understands flat arrays of floats.
Accepts a space and a point from that space. Always returns a 1D array. Raises NotImplementedError if the space is not defined in gym.spaces.
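To make the idea concrete, the sketch below flattens one sample of a dictionary with a discrete "position" (2 values) and a 2 by 2 "velocity" box, using plain lists instead of arrays. It is an illustration under assumptions, not gym's code: the helpers `flatten_discrete` and `flatten_box` are hypothetical, the discrete value is assumed to be one-hot encoded, and the parts are assumed to be concatenated in key order.

```python
def flatten_discrete(n, x, start=0):
    """One-hot encode x as a list of n floats."""
    onehot = [0.0] * n
    onehot[x - start] = 1.0
    return onehot


def flatten_box(x):
    """Flatten arbitrarily nested lists of numbers into one flat list."""
    if isinstance(x, (list, tuple)):
        return [v for item in x for v in flatten_box(item)]
    return [float(x)]


sample = {"position": 1, "velocity": [[0.1, 0.2], [0.3, 0.4]]}

# Concatenate the flattened parts in key order.
flat = flatten_discrete(2, sample["position"]) + flatten_box(sample["velocity"])
```

The resulting list has 6 entries: 2 for the one-hot position and 4 for the velocity values.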
- gym.spaces.utils.unflatten(space: Space[T], x: ndarray) T#
- gym.spaces.utils.unflatten(space: Box | MultiBinary, x: np.ndarray) np.ndarray
- gym.spaces.utils.unflatten(space: Discrete, x: ndarray) int
- gym.spaces.utils.unflatten(space: MultiDiscrete, x: ndarray) ndarray
- gym.spaces.utils.unflatten(space: Tuple, x: ndarray) tuple
- gym.spaces.utils.unflatten(space: Dict, x: ndarray) dict
Unflatten a data point from a space.
This reverses the transformation applied by flatten(). You must ensure that the space argument is the same as for the flatten() call.
Accepts a space and a flattened point. Returns a point with a structure that matches the space. Raises NotImplementedError if the space is not defined in gym.spaces.
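The round trip can be sketched without gym for a point made of a discrete "position" and a flat "velocity" list. The helpers `flatten_point` and `unflatten_point` are hypothetical, and the one-hot encoding of the discrete part is an assumption of the sketch. Both calls must receive the same n, which mirrors the requirement that the same space be used for flattening and unflattening.

```python
def flatten_point(x, n):
    """Flatten {"position": int in [0, n), "velocity": list of floats}."""
    onehot = [1.0 if i == x["position"] else 0.0 for i in range(n)]
    return onehot + list(x["velocity"])


def unflatten_point(flat, n):
    """Invert flatten_point using the same n that produced flat."""
    position = flat[:n].index(1.0)  # index of the single 1 in the one-hot part
    return {"position": position, "velocity": list(flat[n:])}


original = {"position": 2, "velocity": [0.5, -0.25]}
flat = flatten_point(original, n=3)
restored = unflatten_point(flat, n=3)
```

Passing a different n to `unflatten_point` would split the list at the wrong place and silently return a wrong point, which is why the space argument must match.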