Highlights
- Major update of RLlib docs and example scripts for the new API stack.
Ray Libraries
Ray Data
🎉 New Features:
- Expression support for filters (#49016)
- Support `partition_cols` in `write_parquet` (#49411)
- Feature: implement multi-directional sort over Ray Data datasets (#49281)
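
To make the multi-directional sort item (#49281) concrete, here is a minimal pure-Python sketch of sorting by several keys with a separate direction per key. It does not use Ray Data and is not its implementation; `multi_directional_sort` and the sample rows are hypothetical names chosen for illustration.

```python
from typing import Any, Dict, List, Sequence


def multi_directional_sort(
    rows: Sequence[Dict[str, Any]],
    keys: Sequence[str],
    descending: Sequence[bool],
) -> List[Dict[str, Any]]:
    """Sort rows by several keys, each with its own direction (illustrative only)."""
    if len(keys) != len(descending):
        raise ValueError("keys and descending must have the same length")
    result = list(rows)
    # Python's sort is stable, so sorting from the least significant key to the
    # most significant key yields the combined multi-key ordering.
    for key, desc in reversed(list(zip(keys, descending))):
        result.sort(key=lambda row, k=key: row[k], reverse=desc)
    return result


if __name__ == "__main__":
    rows = [
        {"team": "a", "score": 1},
        {"team": "b", "score": 3},
        {"team": "a", "score": 2},
    ]
    # Ascending by team, then descending by score within each team.
    for row in multi_directional_sort(rows, ["team", "score"], [False, True]):
        print(row)
```

The sketch relies on sort stability rather than negating values, so it also works for non-numeric keys such as strings.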
💫 Enhancements:
- Use dask 2022.10.2 (#48898)
- Clarify schema validation error (#48882)
- Raise `ValueError` when the data sort key is `None` (#48969)
- Provide more informative messages on webdataset format errors (#48643)
- Upgrade Arrow version from 17 to 18 (#48448)
- Update `hudi` version to 0.2.0 (#48875)
- `webdataset`: expand JSON objects into individual samples (#48673)
- Support passing kwargs to map tasks. (#49208)
- Add `ExecutionCallback` interface (#49205)
- Add seed for read files (#49129)
- Make `select_columns` and `rename_columns` use Project operator (#49393)
🔨 Fixes:
- Fix partial function name parsing in `map_groups` (#48907)
- Always launch one task for `read_sql` (#48923)
- Reimplement of fix memory pandas (#48970)
- `webdataset`: flatten return args (#48674)
- Handle `numpy > 2.0.0` behaviour in `_create_possibly_ragged_ndarray` (#48064)
- Fix `DataContext` sealing for multiple datasets. (#49096)
- Fix `to_tf` for `List` types (#49139)
- Fix type mismatch error while mapping nullable column (#49405)
- Datasink: support passing write results to `on_write_completes` (#49251)
- Fix `groupby` hang when value contains `np.nan` (#49420)
- Fix bug where `file_extensions` doesn't work with compound extensions (#49244)
- Fix map operator fusion when concurrency is set (#49573)
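
As background for the compound-extension fix (#49244), the sketch below shows why compound extensions such as `tar.gz` are easy to mishandle: `os.path.splitext` only returns the last suffix. The helper `has_extension` is a hypothetical name for illustration, and matching on the path suffix is an assumption about one reasonable approach, not a description of how Ray Data implements the filter.

```python
import os
from typing import Iterable


def has_extension(path: str, extensions: Iterable[str]) -> bool:
    """Return True if path ends with any of the given (possibly compound) extensions."""
    lowered = path.lower()
    # Compare against the full suffix so "tar.gz" is matched as one unit.
    return any(
        lowered.endswith("." + ext.lower().lstrip(".")) for ext in extensions
    )


if __name__ == "__main__":
    path = "data/shard-0.tar.gz"
    # splitext sees only the final suffix, so a naive comparison misses "tar.gz".
    print(os.path.splitext(path)[1])
    print(has_extension(path, ["tar.gz"]))
    print(has_extension("data/notes.gz", ["tar.gz"]))
```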
Ray Train
🎉 New Features:
- Output JSON structured log files for system and application logs (#49414)
- Add support for AMD ROCR_VISIBLE_DEVICES (#49346)
💫 Enhancements:
🏗 Architecture refactoring:
- LightGBM: Rewrite `get_network_params` implementation (#49019)
Ray Tune
🎉 New Features:
- Update `optuna_search` to allow users to configure optuna storage (#48547)
🏗 Architecture refactoring:
Ray Serve
💫 Enhancements:
- Improved request_id generation to reduce proxy CPU overhead (#49537)
- Tune GC threshold by default in proxy (#49720)
- Use `pickle.dumps` for faster serialization from `proxy` to `replica` (#49539)
🔨 Fixes:
- Handle nested ‘=’ in serve run arguments (#49719)
- Fix bug when `ray.init()` is called multiple times with different `runtime_envs` (#49074)
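
The nested `=` fix (#49719) concerns `key=value` arguments whose value itself contains `=`. A minimal sketch of the general technique, splitting only on the first `=`, is shown below. `parse_key_value_args` is a hypothetical helper, not the Serve CLI's actual code.

```python
from typing import Dict, Iterable


def parse_key_value_args(args: Iterable[str]) -> Dict[str, str]:
    """Parse key=value strings, keeping any further '=' characters in the value."""
    parsed: Dict[str, str] = {}
    for arg in args:
        # partition splits on the first '=' only, so nested '=' stay in the value.
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        parsed[key] = value
    return parsed


if __name__ == "__main__":
    print(parse_key_value_args(["url=http://host/path?a=b", "replicas=2"]))
```

Using `str.split("=")` without a limit would instead break the value into several pieces, which is the failure mode this kind of fix avoids.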
🗑️ Deprecations:
- Adds a warning that the default behavior for sync methods will change in a future release. They will be run in a threadpool by default. You can opt into this behavior early by setting `RAY_SERVE_RUN_SYNC_IN_THREADPOOL=1`. (#48897)
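
To illustrate what running a sync method in a threadpool means in general, the following sketch uses plain `asyncio` rather than Ray Serve. `blocking_handler` and `call_handler` are hypothetical names, and how the environment variable is read here is an assumption for demonstration, not Serve's implementation.

```python
import asyncio
import os
import threading
import time
from typing import Tuple


def blocking_handler(x: int) -> Tuple[int, int]:
    """A synchronous handler that blocks briefly and reports its thread id."""
    time.sleep(0.01)
    return x * 2, threading.get_ident()


async def call_handler(x: int, run_in_threadpool: bool) -> Tuple[int, int]:
    if run_in_threadpool:
        # Offload the blocking call so the event loop stays responsive.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, blocking_handler, x)
    # Calling inline blocks the event loop for the duration of the call.
    return blocking_handler(x)


if __name__ == "__main__":
    # Assumption: treat the value "1" as opting in, mirroring the note above.
    opt_in = os.environ.get("RAY_SERVE_RUN_SYNC_IN_THREADPOOL", "0") == "1"
    value, thread_id = asyncio.run(call_handler(21, opt_in))
    print(value, thread_id != threading.get_ident())
```

One consequence of the threadpool model is that sync methods may run concurrently on different threads, so code that was only safe when run inline may need a review for thread safety.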
RLlib
🎉 New Features:
- Add support for external Envs to new API stack: New example script and custom tcp-capable EnvRunner. (#49033)
💫 Enhancements:
- Offline RL:
- APPO/IMPALA acceleration (new API stack):
- Add support for `AggregatorActors` per Learner. (#49284)
- Auto-sleep time AND thread-safety for MetricsLogger. (#48868)
- Activate APPO cont. actions release- and CI tests (HalfCheetah-v1 and Pendulum-v1 new in `tuned_examples`). (#49068)
- Add "burn-in" period setting to the training of stateful RLModules. (#49680)
- Callbacks API: Add support for individual lambda-style callbacks. (#49511)
- Other enhancements: #49687, #49714, #49693, #49497, #49800, #49098
📖 Documentation:
- New example scripts:
- New/rewritten html pages:
- Rewrite checkpointing page. (#49504)
- New scaling guide. (#49528)
- New callbacks page. (#49513)
- Rewrite `RLModule` page. (#49387)
- New AlgorithmConfig page and redo `package_ref` page for algo configs. (#49464)
- Rewrite offline RL page. (#48818)
- Rewrite "key concepts" rst page. (#49398)
- Rewrite RL environments pages. (#49165, #48542)
- Fixes and enhancements: #49465, #49037, #49304, #49428, #49474, #49399, #49713, #49518
🔨 Fixes:
- Add `on_episode_created` callback to SingleAgentEnvRunner. (#49487)
- Fix `train_batch_size_per_learner` problems. (#49715)
- Various other fixes: #48540, #49363, #49418, #49191
🏗 Architecture refactoring:
- RLModule: Introduce `Default[algo]RLModule` classes (#49366, #49368)
- Remove RLlib dependencies from setup.py; add `ormsgpack` (#49489)
🗑️ Deprecations:
Ray Core and Ray Clusters
Ray Core
💫 Enhancements:
- Add `task_name`, `task_function_name` and `actor_name` in Structured Logging (#48703)
- Support redis/valkey authentication with username (#48225)
- Add v6e TPU Head Resource Autoscaling Support (#48201)
- compiled graphs: Support all driver and actor read combinations (#48963)
- compiled graphs: Add ascii based CG visualization (#48315)
- compiled graphs: Add ray[cg] pip install option (#49220)
- Allow uv cache at installation (#49176)
- Support != Filter in GCS for Task State API (#48983)
- compiled graphs: Add CPU-based NCCL communicator for development (#48440)
- Support gcs and raylet log rotation (#48952)
- compiled graphs: Support `nsight.nvtx` profiling (#49392)
🔨 Fixes:
- autoscaler: Health check logs are not visible in the autoscaler container's stdout (#48905)
- Only publish `WORKER_OBJECT_EVICTION` when the object is out of scope or manually freed (#47990)
- autoscaler: Autoscaler doesn't scale up correctly when the KubeRay RayCluster is not in the goal state (#48909)
- autoscaler: Fix incorrectly terminating nodes misclassified as idle in autoscaler v1 (#48519)
- compiled graphs: Fix the missing dependencies when num_returns is used (#49118)
- autoscaler: Fuse scaling requests together to avoid overloading the Kubernetes API server (#49150)
- Fix bug to support S3 pre-signed url for `.whl` file (#48560)
- Fix data race on gRPC client context (#49475)
- Make sure draining node is not selected for scheduling (#49517)
Ray Clusters
💫 Enhancements:
- Azure: Enable accelerated networking as a flag in azure vms (#47988)
📖 Documentation:
- Kuberay: Logging: Add Fluent Bit `DaemonSet` and Grafana Loki to "Persist KubeRay Operator Logs" (#48725)
- Kuberay: Logging: Specify the Helm chart version in "Persist KubeRay Operator Logs" (#48937)
Dashboard
💫 Enhancements:
- Add instance variable to many default dashboard graphs (#49174)
- Display duration in milliseconds if under 1 second. (#49126)
- Add `RAY_PROMETHEUS_HEADERS` env for carrying additional headers to Prometheus (#49353)
- Document the `RAY_PROMETHEUS_HEADERS` env for carrying additional headers to Prometheus (#49700)
🏗 Architecture refactoring:
- Move `memray` dependency from default to observability (#47763)
- Move `StateHead`'s methods into free functions. (#49388)
Thanks
@raulchen, @alanwguo, @omatthew98, @xingyu-long, @tlinkin, @yantzu, @alexeykudinkin, @andrewsykim, @win5923, @csy1204, @dayshah, @richardliaw, @stephanie-wang, @gueraf, @rueian, @davidxia, @fscnick, @wingkitlee0, @KPostOffice, @GeneDer, @MengjinYan, @simonsays1980, @pcmoritz, @petern48, @kashiwachen, @pfldy2850, @zcin, @scottjlee, @Akhil-CM, @Jay-ju, @JoshKarpel, @edoakes, @ruisearch42, @gorloffslava, @jimmyxie-figma, @bthananjeyan, @sven1977, @bnorick, @jeffreyjeffreywang, @ravi-dalal, @matthewdeng, @angelinalg, @ivanthewebber, @rkooo567, @srinathk10, @maresb, @gvspraveen, @akyang-anyscale, @mimiliaogo, @bveeramani, @ryanaoleary, @kevin85421, @richardsliu, @hartikainen, @coltwood93, @mattip, @Superskyyy, @justinvyu, @hongpeng-guo, @ArturNiederfahrenhorst, @jecsand838, @Bye-legumes, @hcc429, @WeichenXu123, @martinbomio, @HollowMan6, @MortalHappiness, @dentiny, @zhe-thoughts, @anyadontfly, @smanolloff, @richo-anyscale, @khluu, @xushiyan, @rynewang, @japneet-anyscale, @jjyao, @sumanthratna, @saihaj, @aslonnie
Many thanks to all those who contributed to this release!