Highlights
- Runtime Environments are ready for general use! This feature enables you to dynamically specify per-task, per-actor and per-job dependencies, including a working directory, environment variables, pip packages and conda environments. Install it with pip install -U 'ray[default]'.
- Ray Dataset is now in alpha! Dataset is an interchange format for distributed datasets, powered by Arrow. You can also use it for a basic Ray native data processing experience. Check it out here.
- Ray Lightning v0.1 has been released! You can install it via pip install ray-lightning. Ray Lightning is a library of PyTorch Lightning plugins for distributed training using Ray. Features:
  - Enables quick and easy parallel training
  - Supports PyTorch DDP, Horovod, and Sharded DDP with Fairscale
  - Integrates with Ray Tune for hyperparameter optimization and is compatible with Ray Client
- pip install ray now has a significantly reduced set of dependencies. Features such as the dashboard, the cluster launcher, runtime environments, and observability metrics may require pip install -U 'ray[default]' to be enabled. Please report any problems this causes on GitHub!
Ray Autoscaler
🎉 New Features:
- The Ray autoscaler now supports TPUs on GCP. Please refer to this example for spinning up a simple TPU cluster. (#17278)
💫Enhancements:
- Better AWS networking configurability (#17236 #17207 #14080)
- Support for running autoscaler without NodeUpdaters (#17194, #17328)
🔨 Fixes:
Ray Client
💫Enhancements:
- Updated docs for client server ports and ray.init(ray://) (#17003, #17333)
- Better error handling for deserialization failures (#17035)
🔨 Fixes:
- Fix for server proxy not working with non-default redis passwords (#16885)
Ray Core
🎉 New Features:
- Runtime Environments are ready for general use!
  - Specify a working directory to upload your local files to all nodes in your cluster.
  - Specify different conda and pip dependencies for your tasks and actors and have them installed on the fly.
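As a rough illustration of the runtime environment options listed above, the sketch below assembles a runtime_env dictionary and passes it to Ray at the job level and at the task level. The helper names build_runtime_env and run_example are hypothetical, and the keys ("working_dir", "pip", "env_vars") and the calls ray.init(runtime_env=...) and .options(runtime_env=...) reflect the Ray 1.x API as an assumption; check the documentation for your Ray version before relying on it.

```python
from typing import Dict, List, Optional


def build_runtime_env(
    working_dir: Optional[str] = None,
    pip: Optional[List[str]] = None,
    env_vars: Optional[Dict[str, str]] = None,
) -> dict:
    """Assemble a runtime_env dictionary, leaving out fields that were not given.

    Hypothetical helper for illustration; Ray itself only needs the plain dict.
    """
    runtime_env = {}
    if working_dir is not None:
        runtime_env["working_dir"] = working_dir
    if pip:
        runtime_env["pip"] = list(pip)
    if env_vars:
        runtime_env["env_vars"] = dict(env_vars)
    return runtime_env


def run_example() -> str:
    """Start Ray with a job-level environment and give one task its own."""
    import os

    import ray  # imported lazily so the helper above can be used without Ray

    # Job level: upload the current directory to every node (assumed API).
    ray.init(runtime_env=build_runtime_env(working_dir="."))

    @ray.remote
    def read_stage() -> str:
        return os.environ.get("STAGE", "unset")

    # Task level: this task gets its own environment variables (assumed API).
    task_env = build_runtime_env(env_vars={"STAGE": "test"})
    return ray.get(read_stage.options(runtime_env=task_env).remote())
```

Calling run_example() requires a Ray installation that includes runtime environment support (pip install -U 'ray[default]'); build_runtime_env on its own has no such requirement.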
🔨 Fixes:
- Fix plasma store bugs for better data processing stability (#16976, #17135, #17140, #17187, #17204, #17234, #17396, #17550)
- Fix a placement group bug where CUDA_VISIBLE_DEVICES was not properly detected (#17318)
- Improved Ray stacktrace messages. (#17389)
- Improved GCS stability and scalability (#17456, #17373, #17334, #17238, #17072)
🏗 Architecture refactoring:
Ray Data Processing
Ray Dataset is now in alpha! Dataset is an interchange format for distributed datasets, powered by Arrow. You can also use it for a basic Ray native data processing experience. Check it out here.
RLlib
🎉 New Features:
- Support for RNN/LSTM models with SAC (new agent: "RNNSAC"). Shoutout to ddworak94! (#16577)
- Support for ONNX model export (tf and torch). (#16805)
- Allow Policies to be added to/removed from a Trainer on-the-fly. (#17566)
🔨 Fixes:
- Fix for view requirements captured during compute actions test pass. Shoutout to Chris Bamford (#15856)
- Issues: 17397, 17425, 16715, 17174. When on driver, Torch|TFPolicy should not use ray.get_gpu_ids() (b/c no GPUs assigned by ray). (#17444)
- Other bug fixes: #15709, #15911, #16083, #16716, #16744, #16896, #16999, #17010, #17014, #17118, #17160, #17315, #17321, #17335, #17341, #17356, #17460, #17543, #17567, #17587
🏗 Architecture refactoring:
- CV2 to Skimage dependency change (CV2 still supported). Shoutout to Vince Jankovics. (#16841)
- Unify tf and torch policies wrt. multi-GPU handling: PPO-torch is now 33% faster on Atari and 1 GPU. (#17371)
- Implement all policy maps inside RolloutWorkers to be LRU-caches so that a large number of policies can be added on-the-fly w/o running out of memory. (#17031)
- Move all tf static-graph code into DynamicTFPolicy, such that policies can be deleted and their tf-graph is GC'd. (#17169)
- Simplify multi-agent configs: In most cases, creating dummy envs (only to retrieve spaces) is no longer necessary. (#16565, #17046)
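To make the LRU-cache idea behind the policy map change more concrete, here is a minimal, generic sketch of a bounded mapping that evicts the least recently used entry. It is not RLlib's implementation: the class name LRUPolicyMap and the on_evict hook are hypothetical, and the "policies" are plain strings standing in for real policy objects.

```python
from collections import OrderedDict
from typing import Callable, Hashable, Optional


class LRUPolicyMap:
    """Hold at most `capacity` policies; evict the least recently used one."""

    def __init__(
        self,
        capacity: int,
        on_evict: Optional[Callable[[Hashable, object], None]] = None,
    ):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._cache = OrderedDict()  # oldest entry first, newest entry last
        self._on_evict = on_evict    # hypothetical hook, e.g. to save to disk

    def __setitem__(self, policy_id, policy):
        if policy_id in self._cache:
            self._cache.move_to_end(policy_id)
        self._cache[policy_id] = policy
        # Drop the oldest entries until the map fits its capacity again.
        while len(self._cache) > self.capacity:
            old_id, old_policy = self._cache.popitem(last=False)
            if self._on_evict is not None:
                self._on_evict(old_id, old_policy)

    def __getitem__(self, policy_id):
        policy = self._cache[policy_id]
        self._cache.move_to_end(policy_id)  # a read counts as a "use"
        return policy

    def __contains__(self, policy_id):
        return policy_id in self._cache

    def __len__(self):
        return len(self._cache)


evicted = []
policies = LRUPolicyMap(capacity=2, on_evict=lambda pid, _: evicted.append(pid))
policies["main"] = "policy-main"
policies["opponent_v1"] = "policy-v1"
_ = policies["main"]                   # touch "main" so it is most recent
policies["opponent_v2"] = "policy-v2"  # over capacity: "opponent_v1" is evicted
```

Because memory use is bounded by capacity rather than by the number of policies ever added, new policies can keep arriving without the map growing indefinitely.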
📖Documentation:
- Example scripts do-over (shoutout to Stefan Schneider for this initiative).
- Example script: League-based self-play with "open spiel" env. (#17077)
- Other doc improvements: #15664 (shoutout to kk-55), #17030, #17530
Tune
🎉 New Features:
- Dynamic trial resource allocation with ResourceChangingScheduler (#16787)
- It is now possible to use a define-by-run function to generate a search space with OptunaSearcher (#17464)
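A define-by-run search space is a function that receives an Optuna trial and declares parameters as it goes, so later parameters can depend on earlier choices. The sketch below is an assumption-laden illustration: the parameter names are made up, the searcher class is imported as OptunaSearch from ray.tune.suggest.optuna (the Ray 1.x location as far as we know), and run_with_tune is a hypothetical wrapper.

```python
def define_search_space(trial):
    """Declare parameters on an Optuna trial; "momentum" exists only for SGD."""
    optimizer = trial.suggest_categorical("optimizer", ["sgd", "adam"])
    trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    if optimizer == "sgd":
        # Conditional parameter: only defined on the SGD branch.
        trial.suggest_float("momentum", 0.0, 0.99)


def run_with_tune(trainable, num_samples=20):
    """Hypothetical wiring into Tune; needs ray[tune] and optuna installed."""
    from ray import tune
    from ray.tune.suggest.optuna import OptunaSearch  # assumed Ray 1.x path

    # The space is the function itself rather than a dictionary of domains.
    searcher = OptunaSearch(space=define_search_space, metric="loss", mode="min")
    return tune.run(trainable, search_alg=searcher, num_samples=num_samples)
```

Conditional parameters such as momentum are hard to express in a static configuration dictionary, which is the main reason to reach for the define-by-run style.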
💫Enhancements:
- String names of searchers/schedulers can now be used directly in tune.run (#17517)
- Filter placement group resources if not in use (progress reporting) (#16996)
- Add unit tests for flatten_dict (#17241)
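As a sketch of the string-name enhancement above, the snippet below passes a searcher and a scheduler to tune.run by name instead of constructing the objects. The specific names "random" and "async_hyperband", the toy objective, and the run helper are assumptions for illustration, and tune.report reflects the Ray 1.x function API; consult the Tune documentation for the names your version accepts.

```python
def objective(x: float) -> float:
    """Toy loss with its minimum at x = 3."""
    return (x - 3.0) ** 2


def trainable(config):
    """Function trainable that reports a single loss value (Ray 1.x style)."""
    from ray import tune  # imported lazily so objective() works without Ray

    tune.report(loss=objective(config["x"]))


def run(num_samples: int = 20):
    """Hypothetical entry point; requires ray[tune] to be installed."""
    from ray import tune

    return tune.run(
        trainable,
        config={"x": tune.uniform(-10.0, 10.0)},
        metric="loss",
        mode="min",
        search_alg="random",          # assumed name of the default searcher
        scheduler="async_hyperband",  # assumed name of the ASHA scheduler
        num_samples=num_samples,
    )
```

Passing names keeps short scripts free of extra imports; constructing the searcher or scheduler object yourself is still the way to set non-default options.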
🔨Fixes:
📖Documentation:
- LightGBM integration (#17304)
- Other documentation improvements: #17407 (shoutout to amavilla), #17441, #17539, #17503
SGD
🎉 New Features:
- We have started initial development on a new RaySGD v2! We will be rolling it out in a future version of Ray. See the documentation here. (#17536, #17623, #17357, #17330, #17532, #17440, #17447, #17300, #17253)
💫Enhancements:
- Placement Group support for TorchTrainer (#17037)
Serve
🎉 New Features:
- Add Ray API stability annotations to Serve, marking many serve.* APIs as Stable (#17295)
- Support runtime_env's working_dir for Ray Serve (#16480)
🔨Fixes:
- Fix FastAPI's response_model not added to class based view routes (#17376)
- Replace backend with deployment in metrics & logging (#17434)
🏗Stability Enhancements:
- Large-scale (1K+ cores) Ray Serve tests with single and multiple deployments now run nightly (#17310, #17411, #17368, #17026, #17277)
Thanks
Many thanks to all who contributed to this release:
@suquark, @xwjiang2010, @clarkzinzow, @kk-55, @mGalarnyk, @pdames, @Souphis, @edoakes, @sasha-s, @iycheng, @stephanie-wang, @antoine-galataud, @scv119, @ericl, @amogkam, @ckw017, @wuisawesome, @krfricke, @vakker, @qingyun-wu, @Yard1, @juliusfrost, @DmitriGekhtman, @clay4444, @mwtian, @corentinmarek, @matthewdeng, @simon-mo, @pcmoritz, @qicosmos, @architkulkarni, @rkooo567, @navneet066, @dependabot[bot], @jovany-wang, @kombuchafox, @thomasjpfan, @kimikuri, @Ivorforce, @franklsf95, @MissiontoMars, @lantian-xu, @duburcqa, @ddworak94, @ijrsvt, @sven1977, @kira-lin, @SongGuyang, @kfstorm, @Rohan138, @jamesmishra, @amavilla, @fyrestone, @lixin-wei, @stefanbschneider, @jiaodong, @richardliaw, @WangTaoTheTonic, @chenk008, @Catch-Bull, @Bam4d