github ray-project/ray ray-1.1.0


Ray 1.1.0

Ray Core

🎉 New Features:

  • Progress towards supporting a Ray client
  • Descendant tasks are cancelled when the calling task is cancelled; see the sketch after this list
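
The descendant-cancellation behavior can be illustrated with a minimal sketch; the exact exception surfaced by `ray.get` after a cancel may vary by Ray version.

```python
import time
import ray

ray.init()

@ray.remote
def child():
    time.sleep(600)

@ray.remote
def parent():
    # The parent blocks on a descendant task it launched.
    return ray.get(child.remote())

ref = parent.remote()
time.sleep(1)  # give the tasks a moment to start

# Cancelling the parent also cancels the descendant `child` task
# (the behavior described in this release note).
ray.cancel(ref)

try:
    ray.get(ref)
except (ray.exceptions.TaskCancelledError, ray.exceptions.RayTaskError):
    print("parent and its descendant were cancelled")
```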

🔨 Fixes:

  • Improved object broadcast robustness
  • Improved placement group support

πŸ— Architecture refactoring:

  • Progress towards the new scheduler backend

RLlib

🎉 New Features:

  • SUMO simulator integration (rllib/examples/simulators/sumo/). Huge thanks to Lara Codeca! (#11710)
  • SlateQ Algorithm added for PyTorch. Huge thanks to Henry Chen! (#11450)
  • MAML extension for all Models, except recurrent ones. (#11337)
  • Curiosity Exploration Module for tf1.x/2.x/eager; see the sketch after this list. (#11945)
  • Minimal JAXModelV2 example. (#12502)
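
As a rough sketch, the new Curiosity module is switched on via RLlib's `exploration_config`; the hyperparameter values below are illustrative, not defaults.

```python
from ray.rllib.agents.ppo import PPOTrainer

config = {
    "env": "CartPole-v0",
    "framework": "tf",
    "exploration_config": {
        "type": "Curiosity",
        # Illustrative values, not tuned defaults.
        "eta": 1.0,         # weight of the intrinsic (curiosity) reward
        "lr": 0.001,        # learning rate of the curiosity module
        "feature_dim": 64,  # size of the learned feature embedding
        "sub_exploration": {"type": "StochasticSampling"},
    },
}

trainer = PPOTrainer(config=config)
print(trainer.train()["episode_reward_mean"])
```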

🔨 Fixes:

  • Fix RNN learning for tf2.x/eager. (#11720)
  • LSTM prev-action/prev-reward are now settable separately, and prev-actions are now one-hot encoded. (#12397)
  • PyTorch LR schedule not working. (#12396)
  • Various PyTorch GPU bug fixes. (#11609)
  • SAC loss was not using prioritized-replay weights in the critic's loss term. (#12394)
  • Fix epsilon-greedy Exploration for nested action spaces. (#11453)

πŸ— Architecture refactoring:

  • The Trajectory View API is now on by default, speeding up PG-type algorithms (e.g., PPO on Atari) by ~20%. (#11717, #11826, #11747, and #11827)

Tune

🎉 New Features:

  • Loggers can now be passed as objects to tune.run. The new ExperimentLogger abstraction was introduced for all loggers, making it much easier to configure logging behavior. (#11984, #11746, #11748, #11749)
  • Tune verbosity was refactored into four levels: 0: silent, 1: experiment-level logs only, 2: general trial-level logs, 3: detailed trial-level logs (the default); see the sketch after this list. (#11767, #12132, #12571)
  • Docker and Kubernetes autoscaling environments are detected automatically, and the correct checkpoint/log syncing tools are used (#12108)
  • Trainables can now easily leverage TensorFlow DistributedStrategy! (#11876)
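
A minimal sketch of the new verbosity levels passed to `tune.run`; the trainable and metric here are placeholders.

```python
from ray import tune

def trainable(config):
    for step in range(10):
        # Report a dummy metric derived from the sampled hyperparameter.
        tune.report(mean_loss=config["lr"] * step)

analysis = tune.run(
    trainable,
    config={"lr": tune.uniform(0.001, 0.1)},
    num_samples=4,
    verbose=1,  # 0: silent, 1: experiment-level, 2: trial-level, 3: detailed (default)
)
print(analysis.get_best_config(metric="mean_loss", mode="min"))
```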

💫 Enhancements:

  • Introduced a new serialization debugging utility (#12142)
  • Added a new lightweight PyTorch Lightning example (#11497, #11585)
  • The BOHB search algorithm can be seeded with a random state (#12160)
  • The default anonymous metrics can be used automatically if a mode is set in tune.run (#12159).
  • Added HDFS as Cloud Sync Client (#11524)
  • Added xgboost_ray integration (#12572)
  • Tune search spaces can now be passed to search algorithms on initialization, not only via tune.run; see the sketch after this list (#11503)
  • Refactored and added examples (#11931)
  • Callable accepted for register_env (#12618)
  • Tune search algorithms can handle/ignore infinite and NaN numbers (#11835)
  • Improved scalability for experiment checkpointing (#12064)
  • Nevergrad now supports points_to_evaluate (#12207)
  • Placement group support for distributed training (#11934)
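
A hedged sketch of passing a Tune search space to a search algorithm at initialization, using the HyperOpt integration as an example; other searchers accept a `space` argument similarly, and exact conversion support may vary by searcher.

```python
from ray import tune
from ray.tune.suggest.hyperopt import HyperOptSearch  # requires `hyperopt` to be installed

def trainable(config):
    tune.report(mean_loss=(config["x"] - 2) ** 2)

# Pass a Tune search space directly to the searcher instead of via tune.run().
searcher = HyperOptSearch(
    space={"x": tune.uniform(-10, 10)},
    metric="mean_loss",
    mode="min",
)

tune.run(trainable, search_alg=searcher, num_samples=8)
```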

🔨 Fixes:

  • Fixed with_parameters behavior to avoid serializing large data in scope; see the sketch after this list (#12522)
  • TBX logger supports None (#12262)
  • Better error when metric or mode unset in search algorithms (#11646)
  • Better warnings/exceptions for fail_fast='raise' (#11842)
  • Removed some bottlenecks in the TrialRunner (#12476)
  • Fix file descriptor leak by syncer and Tensorboard (#12590, #12425)
  • Fixed validation for search metrics (#11583)
  • Fixed hyperopt randint limits (#11946)
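
The with_parameters fix is easiest to see with a small sketch of the intended usage pattern; names here are placeholders.

```python
import numpy as np
from ray import tune

def trainable(config, data=None):
    # `data` is shipped through the Ray object store instead of being
    # captured (and re-serialized) in the function's closure.
    tune.report(mean_loss=float(np.mean(data)) * config["scale"])

large_data = np.random.rand(1_000_000)

tune.run(
    tune.with_parameters(trainable, data=large_data),
    config={"scale": tune.uniform(0.1, 1.0)},
    num_samples=2,
)
```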

Serve

🔨 Fixes:

  • Set serve.start(http_host=None) to disable HTTP servers. If you are only using ServeHandle, this option lowers resource usage; see the sketch after this list. (#11627)
  • Flask requests will no longer create reference cycles. This means peak memory usage should be lower for high traffic scenarios. (#12560)
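
A minimal sketch of the `http_host=None` option, assuming the Ray 1.1-era client-based Serve API; the backend and endpoint names are placeholders.

```python
import ray
from ray import serve

ray.init()

# Start Serve without an HTTP proxy; deployments are reachable only via
# ServeHandle, which lowers resource usage when HTTP ingress isn't needed.
client = serve.start(http_host=None)

def hello(request):
    return "hello"

# Assumes the 1.1-era Serve API (create_backend / create_endpoint / get_handle).
client.create_backend("hello_backend", hello)
client.create_endpoint("hello", backend="hello_backend")

handle = client.get_handle("hello")
print(ray.get(handle.remote()))
```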

πŸ— Architecture refactoring:

Ray Cluster Launcher (Autoscaler)

💫 Enhancements:

  • Containers no longer run as the root user by default (#11407)
  • The container shared-memory size (SHM size) is auto-populated when using containers (#11953)

SGD

🎉 New Features:

  • Easily customize your torch.DistributedDataParallel configuration by passing a ddp_args field to TrainingOperator.register (#11771); see the sketch below.
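
A hedged sketch of the new ddp_args field. The return structure of `self.register` is assumed here, and the DDP keyword argument shown is only illustrative.

```python
import torch
import torch.nn as nn
from ray.util.sgd import TorchTrainer
from ray.util.sgd.torch import TrainingOperator

class MyOperator(TrainingOperator):
    def setup(self, config):
        model = nn.Linear(4, 1)
        optimizer = torch.optim.SGD(model.parameters(), lr=config.get("lr", 0.01))
        # ddp_args is forwarded to torch.nn.parallel.DistributedDataParallel;
        # find_unused_parameters is just an illustrative DDP kwarg.
        # (Assumed: register returns the wrapped model and optimizer.)
        self.model, self.optimizer = self.register(
            models=model,
            optimizers=optimizer,
            ddp_args={"find_unused_parameters": True},
        )

trainer = TorchTrainer(
    training_operator_cls=MyOperator,
    num_workers=2,
    config={"lr": 0.01},
)
```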

🔨 Fixes:

  • TorchTrainer now properly scales up to more workers if more resources become available (#12562)

📖 Documentation:

  • The new callback API for using Ray SGD with Tune is now documented (#11479)
  • PyTorch Lightning + Ray SGD integration is now documented (#12440)

Dashboard

🔨 Fixes:

  • Fixed a bug that prevented viewing the logs for cluster workers
  • Fixed a bug that caused the "Logical View" page to crash when opening the list of actors for a given class.

πŸ— Architecture refactoring:

  • The dashboard runs on a new backend architecture that is more scalable and better tested. The dashboard should work on ~100-node clusters now, and we're working on lifting scalability constraints to support even larger clusters.

Thanks

Many thanks to all those who contributed to this release:
@bartbroere, @SongGuyang, @gramhagen, @richardliaw, @ConeyLiu, @weepingwillowben, @zhongchun, @ericl, @dHannasch, @timurlenk07, @kaushikb11, @krfricke, @desktable, @bcahlit, @rkooo567, @amogkam, @micahtyong, @edoakes, @stephanie-wang, @clay4444, @ffbin, @mfitton, @barakmich, @pcmoritz, @AmeerHajAli, @DmitriGekhtman, @iamhatesz, @raulchen, @ingambe, @allenyin55, @sven1977, @huyz-git, @yutaizhou, @suquark, @ashione, @simon-mo, @raoul-khour-ts, @Leemoonsoo, @maximsmol, @alanwguo, @kishansagathiya, @wuisawesome, @acxz, @gabrieleoliaro, @clarkzinzow, @jparkerholder, @kingsleykuan, @InnovativeInventor, @ijrsvt, @lasagnaphil, @lcodeca, @jiajiexiao, @heng2j, @wumuzi520, @mvindiola1, @aaronhmiller, @robertnishihara, @WangTaoTheTonic, @chaokunyang, @nikitavemuri, @kfstorm, @roireshef, @fyrestone, @viotemp1, @yncxcw, @karstenddwx, @hartikainen, @sumanthratna, @architkulkarni, @michaelzhiluo, @UWFrankGu, @oliverhu, @danuo, @lixin-wei
