Farama-Foundation/Gymnasium v1.0.0a2 on GitHub

This is our second alpha version which we hope to be the last before the full Gymnasium v1.0.0 release. We summarise the key changes, bug fixes and new features added in this alpha version.

Key Changes

Atari environments

ale-py, which provides the Atari environments, has been updated in v0.9.0 to use Gymnasium as the API backend. Furthermore, the pip install now contains the ROMs, so all that should be necessary for installing Atari is pip install "gymnasium[atari]" (as a result, gymnasium[accept-rom-license] has been removed). As a reminder, for Gymnasium v1.0 to register external environments (e.g., ale-py), you must import ale_py before creating any of the Atari environments.

Collecting seeding values

It was possible to seed both environments and spaces with None to use a random initial seed value; however, it wasn't possible to know what these initial seed values were. We have addressed this for Space.seed and reset.seed in #1033 and #889. For Space.seed, we have changed the return type to be specialised for each space such that the following code will work for all spaces.

seeded_values = space.seed(None)
initial_samples = [space.sample() for _ in range(10)]

reseed_values = space.seed(seeded_values)
reseed_samples = [space.sample() for _ in range(10)]

assert seeded_values == reseed_values
assert initial_samples == reseed_samples

Additionally, for environments, we have added a new np_random_seed attribute that will store the most recent np_random seed value from reset(seed=seed).
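
The seed round trip described in this section can be pictured with a small pure-Python sketch. It assumes a hypothetical ToyDiscrete class built on the standard library's random module; it is not a Gymnasium space and does not use NumPy generators. It only illustrates the contract that seeding with None reports the seed that was actually drawn.

```python
import random
import secrets
from typing import Optional


class ToyDiscrete:
    """Hypothetical stand-in for a space; not part of Gymnasium."""

    def __init__(self, n: int) -> None:
        self.n = n
        self._rng = random.Random()

    def seed(self, seed: Optional[int] = None) -> int:
        # With no seed given, draw one and report it so the caller can reuse it.
        if seed is None:
            seed = secrets.randbits(64)
        self._rng = random.Random(seed)
        return seed

    def sample(self) -> int:
        # Draw an integer in [0, n) from the seeded generator.
        return self._rng.randrange(self.n)


space = ToyDiscrete(5)
seeded_value = space.seed(None)
initial_samples = [space.sample() for _ in range(10)]

# Reseeding with the reported value reproduces the same samples.
reseed_value = space.seed(seeded_value)
reseed_samples = [space.sample() for _ in range(10)]
```

The returned seed is the only piece of state that needs to be logged to replay a run that was started without an explicit seed.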

Environment Version changes

It was discovered recently that the MuJoCo-based Pusher was not compatible with MuJoCo >= 3: bug fixes in MuJoCo revealed that the model density of the block the agent has to push was the density of air. This caused issues for users of Pusher with MuJoCo v3+. Therefore, we have disabled the v4 environment with MuJoCo >= 3 and updated the model in the MuJoCo v5 environment so that it produces the expected behaviour, matching v4 with MuJoCo < 3 (#1019).
Alpha 2 includes new v5 MuJoCo environments as a follow-up to the v4 environments added two years ago, fixing inconsistencies, adding new features and updating the documentation. We have decided to mark the MuJoCo-py (v2 and v3) environments as deprecated and plan to remove them from Gymnasium in future (#926).
Lunar Lander's version has increased from v2 to v3 due to two bug fixes. The first fixes the environment's determinism: the world object was not completely destroyed on reset, which caused non-determinism in particular cases (#979). Second, the wind generation (turned off by default) was not randomly regenerated on each reset; we have updated this to gain statistical independence between episodes (#959).

Box Samples

It was discovered that spaces.Box would allow low and high values outside the dtype's range (#774), which could result in some very strange edge cases that were very difficult to detect. We hope that these changes improve debugging and the detection of invalid inputs to the space; however, let us know if your environment raises issues related to this.
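
To illustrate the kind of problem described here, the sketch below tests whether a pair of bounds fits a fixed-width integer type. It is pure Python with two hypothetical helpers, int_dtype_range and bounds_fit_int_dtype; it is not Gymnasium's validation code, and it makes no claim about the exact error or warning that spaces.Box produces.

```python
from typing import Tuple


def int_dtype_range(bits: int, signed: bool) -> Tuple[int, int]:
    """Return the smallest and largest value of a fixed-width integer type."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2**bits - 1


def bounds_fit_int_dtype(low: int, high: int, bits: int, signed: bool) -> bool:
    """Hypothetical helper: True if [low, high] is representable in the type."""
    dtype_min, dtype_max = int_dtype_range(bits, signed)
    return dtype_min <= low <= high <= dtype_max


# An unsigned 8-bit type covers 0..255, so an upper bound of 256 cannot be stored.
ok_uint8 = bounds_fit_int_dtype(0, 255, bits=8, signed=False)
bad_uint8 = bounds_fit_int_dtype(0, 256, bits=8, signed=False)
# A signed 8-bit type covers -128..127, so a lower bound of -129 is out of range.
bad_int8 = bounds_fit_int_dtype(-129, 127, bits=8, signed=True)
```

Bounds that fail such a check would otherwise wrap around or be truncated when cast to the dtype, which is the sort of edge case that is hard to spot after the fact.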

Bug Fixes

Update CartPoleVectorEnv for the new autoreset API (#915)
Fix wrappers.vector.RecordEpisodeStatistics episode length computation under the new autoreset API (#1018)
Remove mujoco-py import error for v4+ MuJoCo environments (#934)
Fix make_vec(**kwargs) not being passed to vector entry point envs (#952)
Fix reading shared memory for Tuple and Dict spaces (#941)
Fix MultiDiscrete.from_jsonable for Windows (#932)
Remove play rendering normalisation (#956)

New Features

Add Python 3.12 support
Add a new OneOf space that provides exclusive unions of spaces (#812)
Update Dict.sample to use standard Python dicts rather than OrderedDict due to dropping Python 3.7 support (#977)
Jax environments return jax data rather than numpy data (#817)
Add wrappers.vector.HumanRendering and remove human rendering from CartPoleVectorEnv (#1013)
Add more helpful error messages if users use a mixture of Gym and Gymnasium (#957)
Add sutton_barto_reward argument for CartPole that changes the reward function to not return 1 on terminating states (#958)
Add visual_options rendering argument for MuJoCo environments (#965)
Add exact argument to utils.env_checker.data_equivalence (#924)
Update wrappers.NormalizeObservation observation space and change observation to float32 (#978)
Catch exception during env.spec if kwarg is unpickleable (#982)
Improve ImportError for Box2D (#1009)
Add metadata field to VectorEnv and VectorWrapper (#1006)
Fix make_vec for sync or async when modifying make arguments (#1027)
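
As one illustration from the list above, the sutton_barto_reward argument changes CartPole's per-step reward. The sketch below assumes the scheme described in the CartPole documentation (by default +1 on every step including the terminating one; with the option enabled, 0 on non-terminating steps and -1 on the terminating step). cartpole_reward is a hypothetical helper written for this example, not Gymnasium's code.

```python
def cartpole_reward(terminated: bool, sutton_barto_reward: bool = False) -> float:
    """Hypothetical helper returning the reward for a single step."""
    if sutton_barto_reward:
        # Assumed Sutton-Barto scheme: nothing while balancing, a penalty on failure.
        return -1.0 if terminated else 0.0
    # Assumed default scheme: +1 on every step, including the terminating one.
    return 1.0


# An episode of four steps whose last step terminates.
terminations = [False, False, False, True]
default_return = sum(cartpole_reward(t) for t in terminations)
sutton_barto_return = sum(
    cartpole_reward(t, sutton_barto_reward=True) for t in terminations
)
```

Under these assumptions the default return grows with episode length, while the Sutton-Barto return only records whether the episode ended in failure.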

Full Changelog: v1.0.0a1...v1.0.0a2, v0.29.1...v1.0.0a2
