Highlights
- pyarrow is no longer vendored. Ray now uses the C++ Arrow API directly, so you can use any version of pyarrow with Ray. (#7233)
- The dashboard is turned on by default. It shows node and process information, actor information, and Ray Tune trial information. You can also use `ray.show_in_webui` to display custom messages for actors. Please try it out and send us feedback! (#6705, #6820, #6822, #6911, #6932, #6955, #7028, #7034)
- We have made progress on distributed reference counting (behind a feature flag). You can try it out with `ray.init(_internal_config=json.dumps({"distributed_ref_counting_enabled": 1}))`. It is designed to help manage memory using precise distributed garbage collection. (#6945, #6946, #7029, #7075, #7218, #7220, #7222, #7235, #7249)
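The feature-flag call above needs `json` imported and passes the flag as a JSON string. A minimal sketch of how that could be wrapped is below; the helper names `build_internal_config` and `init_ray_with_ref_counting` are hypothetical and not part of Ray, and the second helper assumes Ray 0.8.2 is installed.

```python
import json


def build_internal_config():
    """Return the JSON string that the release notes pass to ray.init."""
    # The flag name and value are copied from the release notes.
    return json.dumps({"distributed_ref_counting_enabled": 1})


def init_ray_with_ref_counting():
    """Start Ray with the feature flag enabled (requires Ray to be installed)."""
    import ray  # imported lazily so build_internal_config() works without Ray

    return ray.init(_internal_config=build_internal_config())


print(build_internal_config())  # {"distributed_ref_counting_enabled": 1}
```

Calling `init_ray_with_ref_counting()` once at program start should be equivalent to the one-liner in the note above.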
Breaking changes
- Many experimental Ray libraries have moved to the util namespace. (#7100)
  - `ray.experimental.multiprocessing` => `ray.util.multiprocessing`
  - `ray.experimental.joblib` => `ray.util.joblib`
  - `ray.experimental.iter` => `ray.util.iter`
  - `ray.experimental.serve` => `ray.serve`
  - `ray.experimental.sgd` => `ray.util.sgd`
- Tasks and actors are cleaned up if their owner process dies. (#6818)
- The `OMP_NUM_THREADS` environment variable defaults to 1 if unset. This improves training performance and reduces resource contention. (#6998)
- We now vendor `psutil` and `setproctitle` to support turning the dashboard on by default. Running `import psutil` after `import ray` will use the version of psutil that ships with Ray. (#7031)
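When updating imports for the namespace move listed above, a small lookup table can help find the new location of an old dotted path. The sketch below is illustrative only: the mapping is copied from the list above, while `RENAMED_MODULES` and `new_module_path` are hypothetical names that Ray does not provide.

```python
# Old -> new module paths, copied from the breaking-changes list.
RENAMED_MODULES = {
    "ray.experimental.multiprocessing": "ray.util.multiprocessing",
    "ray.experimental.joblib": "ray.util.joblib",
    "ray.experimental.iter": "ray.util.iter",
    "ray.experimental.serve": "ray.serve",
    "ray.experimental.sgd": "ray.util.sgd",
}


def new_module_path(dotted_name):
    """Translate an old dotted name (module or attribute) to its new location."""
    for old, new in RENAMED_MODULES.items():
        # Match the module itself or anything nested beneath it.
        if dotted_name == old or dotted_name.startswith(old + "."):
            return new + dotted_name[len(old):]
    return dotted_name  # not affected by the rename


print(new_module_path("ray.experimental.multiprocessing.Pool"))
# ray.util.multiprocessing.Pool
```

Note that `ray.experimental.serve` is the one entry that moves to the top-level `ray.serve` rather than under `ray.util`.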
Core
- The Python raylet client is removed. All raylet communication now goes through the core worker. (#6018)
- Calling `delete()` will not delete objects in the in-memory store. (#7117)
- Removed vanilla pickle serialization for task arguments. (#6948)
- Fixed a bug when passing empty bytes into Python tasks. (#7045)
- Progress toward the next-generation Ray scheduler. (#6913)
- Progress toward a service-based global control store (GCS). (#6686, #7041)
RLlib
- Improved PyTorch support, including a PyTorch version of PPO. (#6826, #6770)
- Added distributed SGD for PPO. (#6918, #7084)
- Added an exploration API for controlling epsilon greedy and stochastic exploration. (#6974, #7155)
- Fixed schedule values going negative past the end of the schedule. (#6971, #6973)
- Added support for histogram outputs in TensorBoard. (#6942)
- Added support for a parallel and customizable evaluation step. (#6981)
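To illustrate the schedule fix mentioned above (values going negative past the end of the schedule), here is a minimal sketch of a linear schedule that clamps at its final value. This is not RLlib's implementation; the function name `linear_schedule` and its parameters are assumptions chosen for illustration.

```python
def linear_schedule(t, schedule_timesteps, initial_value, final_value):
    """Linearly interpolate from initial_value to final_value, then hold."""
    # Clamp the progress fraction to [0, 1] so that timesteps past the end
    # of the schedule keep returning final_value instead of extrapolating.
    fraction = min(max(t / schedule_timesteps, 0.0), 1.0)
    return initial_value + fraction * (final_value - initial_value)


# Epsilon decays from 1.0 to 0.0 over 10000 steps and stays at 0.0 afterwards.
print(linear_schedule(5000, 10000, 1.0, 0.0))   # 0.5
print(linear_schedule(20000, 10000, 1.0, 0.0))  # 0.0
```

Without the clamp, the second call would extrapolate to -1.0, which is the kind of out-of-range value the fix is meant to rule out.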
Tune
- Improved Ax Example. (#7012)
- Process saves asynchronously. (#6912)
- Default to tensorboardX and include it in requirements. (#6836)
- Added experiment stopping API. (#6886)
- Expose progress reporter to users. (#6915)
- Fix directory naming regression. (#6839)
- Handle the NaN case for AsyncHyperBand. (#6916)
- Prevent memory checkpoints from breaking trial fault tolerance. (#6691)
- Remove Keras dependency. (#6827)
- Remove unused tf loggers. (#7090)
- Set correct path when deleting checkpoint folder. (#6758)
- Support callable objects in variant generation. (#6849)
Autoscaler
- Ray nodes now respect docker limits. (#7039)
- Add `--all-nodes` option to `rsync-up`. (#7065)
- Add port-forwarding support for attach. (#7145)
- For AWS, default to latest deep learning AMI. (#6922)
- Added a `ray dashboard` command to proxy the Ray dashboard running on a remote machine. (#6959)
Utility libraries
- Support for scikit-learn with the Ray joblib backend. (#6925)
- Parallel iterators support local shuffle. (#6921)
- [Serve] Support headless services without HTTP. (#7010)
- [Serve] Refactor router to use Ray asyncio support. (#6873)
- [Serve] Support composing arbitrary DAGs. (#7015)
- [RaySGD] Support fp16 via PyTorch apex. (#7061)
- [RaySGD] Refactor PyTorch SGD documentation. (#6910)
- Improvements to Ray Streaming. (#7043, #6666, #7071)
Other improvements
- Progress toward Windows compatibility. (#6882, #6823)
- Ray Kubernetes operator improvements. (#6852, #6851, #7091)
- Java support for concurrent actor calls API. (#7022)
- Java support for direct call for normal tasks. (#7193)
- Java support for cross language Python invocation. (#6709)
- Java support for cross language serialization for actor handles. (#7134)
Known issues
- Passing the same ObjectIDs multiple times as arguments currently doesn't work. (#7296)
- Tasks can exceed gRPC max message size. (#7263)
Thanks
We thank the following contributors for their work on this release:
@mitchellstern, @hugwi, @deanwampler, @alindkhare, @ericl, @ashione, @fyrestone, @robertnishihara, @pcmoritz, @richardliaw, @yutaizhou, @istoica, @edoakes, @ls-daniel, @BalaBalaYi, @raulchen, @justinkterry, @roireshef, @elpollouk, @kfstorm, @Bassstring, @hhbyyh, @Qstar, @mehrdadn, @chaokunyang, @flying-mojo, @ujvl, @AnanthHari, @rkooo567, @simon-mo, @jovany-wang, @ijrsvt, @ffbin, @AmeerHajAli, @gaocegege, @suquark, @MissiontoMars, @zzyunzhi, @sven1977, @stephanie-wang, @amogkam, @wuisawesome, @aannadi, @maximsmol