Unity-Technologies/ml-agents 0.4.0 on GitHub

Environments

To learn more about new and improved environments, see our Example Environments page.

Walker - Humanoid physics-based agent. The agent must move its body toward the goal direction as quickly as possible without falling.
Pyramids - Sparse reward environment. The agent must press a button, then topple a pyramid of blocks to get the golden brick at the top. Used to demonstrate Curiosity.

Revamped the Crawler environment
Added visual-observation-based scenes for:
- BananaCollector
- PushBlock
- Hallway
- Pyramids
Added Imitation-Learning-based scenes for:
- Tennis
- Bouncer
- PushBlock
- Hallway
- Pyramids

[Unity] In Editor Training - It is now possible to train agents directly in the editor without building the scene. For more information, see here.
[Training] Curiosity-Driven Exploration - Adds a curiosity-based intrinsic reward signal when using PPO. Enable it by setting the use_curiosity brain training hyperparameter to true.
[Unity] Support for providing player input using axes within the Player Brain.
[Unity] TensorFlowSharp Plugin has been upgraded to version 1.7.1.
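
The curiosity-based intrinsic reward noted above can be illustrated with a small sketch. It assumes one common formulation, in which the bonus is the scaled prediction error of a forward model that guesses the next state encoding. This is not the ML-Agents implementation; the function names and the strength parameter are hypothetical and chosen only for illustration.

```python
# Conceptual sketch only: NOT the ML-Agents implementation.
# Assumption: the curiosity bonus is the scaled squared error between a
# forward model's predicted next-state encoding and the actual encoding.
# All names here (intrinsic_reward, total_reward, strength) are hypothetical.

def intrinsic_reward(predicted_next, actual_next, strength=0.01):
    """Return a curiosity bonus that grows with the prediction error."""
    if len(predicted_next) != len(actual_next):
        raise ValueError("encodings must have the same length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))
    return strength * 0.5 * squared_error


def total_reward(extrinsic, predicted_next, actual_next, strength=0.01):
    """Combine the environment reward with the curiosity bonus."""
    return extrinsic + intrinsic_reward(predicted_next, actual_next, strength)
```

In a sparse-reward setting such as Pyramids, the extrinsic term is zero on most steps, so a bonus of this kind gives the agent a learning signal for visiting states it cannot yet predict.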

Main ML-Agents code is now within the MLAgents namespace. Ensure that the MLAgents namespace is added (via a using directive) to the project scripts that need it, such as Agent classes.
ASCII art added to learn.py script.
Communication now uses gRPC and Protobuf. JSON libraries removed.
TensorBoard now reports the mean absolute loss, as opposed to the total loss, over the update loop.
PPO algorithm now uses wider Gaussian output for Continuous Control models (increasing performance).
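
The difference between the two TensorBoard loss summaries mentioned above can be shown with a short sketch. It assumes the loss is recorded once per mini-batch in the update loop and that "mean absolute loss" means the average of the absolute per-batch values; the function names and sample numbers are hypothetical, not taken from the ML-Agents code.

```python
# Illustrative sketch under stated assumptions, not the ML-Agents code.
# Assumption: one loss value is recorded per mini-batch in the update loop.

def total_loss(batch_losses):
    """Sum of per-batch losses; scales with the number of batches."""
    return sum(batch_losses)


def mean_absolute_loss(batch_losses):
    """Average magnitude of per-batch losses; independent of batch count."""
    if not batch_losses:
        raise ValueError("batch_losses must not be empty")
    return sum(abs(loss) for loss in batch_losses) / len(batch_losses)


# Hypothetical per-batch losses with mixed signs.
losses = [0.5, -0.25, 0.75, -1.0]
print(total_loss(losses))          # 0.0: positive and negative values cancel
print(mean_absolute_loss(losses))  # 0.625
```

Under these assumptions, the mean absolute value stays comparable across runs with different numbers of mini-batches and does not hide large losses that happen to cancel in a sum.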

Curiosity-driven exploration does not function with On-Demand Decision Making. Expect a fix in v0.4.0a.

Thanks to everyone at Unity who contributed to v0.4, as well as: @sterlingcrispin, @ChrisRisner, @akmadian, @animaleja32, @LeighS, and @5665tm.