Ray Libraries
Ray Data
New Features:
- Added read_hudi (#46273)
Enhancements:
- Improved performance of DelegatingBlockBuilder (#48509)
- Improved memory accounting of pandas blocks (#46939)
Fixes:
- Fixed bug where you can't specify a schema with write_parquet (#48630)
- Fixed bug where to_pandas errors if your dataset contains Arrow and pandas blocks (#48583)
- Fixed bug where map_groups doesn't work with pandas data (#48287)
- Fixed bug where write_parquet errors if your data contains nullable fields (#48478)
- Fixed bug where "Iteration Blocked Time" charts look incorrect (#48618)
- Fixed bug where unique fails with null values (#48750)
- Fixed bug where "Rows Outputted" is 0 in the Data dashboard (#48745)
- Fixed bug where methods like drop_columns cause spilling (#48140)
- Fixed bug where async map tasks hang (#48861)
Deprecations:
- Deprecated read_parquet_bulk (#48691)
- Deprecated iter_tf_batches (#48693)
- Deprecated meta_provider parameter of read functions (#48690)
- Deprecated to_torch (#48692)
Ray Train
Fixes:
- Fix StartTracebackWithWorkerRank serialization (#48548)
Documentation:
- Add example for fine-tuning Llama3.1 with AWS Trainium (#48768)
Ray Tune
Fixes:
- Remove the clear_checkpoint function during Trial restoration error handling. (#48532)
Ray Serve
New Features:
- Initial version of local_testing_mode (#48477)
Enhancements:
- Handle multiple changed objects per LongPollHost.listen_for_change RPC (#48803)
- Add more nuanced checks for http proxy status errors (#47896)
- Improve replica access log messages to include HTTP status info and better resemble standard log format (#48819)
- Propagate replica constructor error to deployment status message and print num retries left (#48531)
Fixes:
- Pending requests that are cancelled before they are assigned to a replica now also return a serve.RequestCancelledError (#48496)
RLlib
Enhancements:
- Release test enhancements. (#45803, #48681)
- Make opencv-python-headless default over opencv-python (#48776)
- Reverse learner queue behavior of IMPALA/APPO (consume oldest batches first, instead of newest, BUT drop oldest batches if queue full). (#48702)
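The IMPALA/APPO learner queue change above can be pictured as a bounded FIFO. The following is a minimal sketch using Python's `collections.deque`, not RLlib's actual implementation; the class name `BoundedLearnerQueue` and its methods are hypothetical and exist only for illustration.

```python
from collections import deque


class BoundedLearnerQueue:
    """Hypothetical stand-in for a learner queue (not RLlib code)."""

    def __init__(self, maxsize):
        # A deque with maxlen discards the item at the opposite end
        # (here: the oldest batch) when appending to a full queue.
        self._queue = deque(maxlen=maxsize)

    def put(self, batch):
        # Add the newest batch; the oldest one is dropped if the queue is full.
        self._queue.append(batch)

    def get(self):
        # Consume the oldest remaining batch first (FIFO order).
        return self._queue.popleft()

    def __len__(self):
        return len(self._queue)


queue = BoundedLearnerQueue(maxsize=3)
for batch_id in range(5):
    queue.put(batch_id)

# Drain the queue in consumption order.
consumed = [queue.get() for _ in range(len(queue))]
print(consumed)
```

In this sketch batches 0 and 1 are dropped because the queue holds at most three items, and the remaining batches come out oldest first.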
Fixes:
- Fix torch scheduler stepping and reporting. (#48125)
- Fix accumulation of results over n training_step calls within same iteration (new API stack). (#48136)
- Various other fixes: #48563, #48314, #48698, #48869.
Documentation:
- Upgrade examples script overview page (new API stack). (#48526)
- Enable RLlib + Serve example in CI and translate to new API stack. (#48687)
Architecture refactoring:
- Switch new API stack on by default for APPO, IMPALA, BC, MARWIL, and CQL. (#48516, #48599)
- Various APPO enhancements (new API stack): Circular buffer (#48798), minor loss math fixes (#48800), target network update logic (#48802), smaller cleanups (#48844).
- Remove rllib_contrib from repo. (#48565)
Ray Core and Ray Clusters
Ray Core
New Features:
- [Core] uv runtime env support (#48479, #48486, #48611, #48619, #48632, #48634, #48637, #48670, #48731)
- [Core] GCS FT with redis sentinel (#47335)
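As a rough illustration of the uv runtime env support listed above, the sketch below builds a `runtime_env` dictionary with a `uv` field and passes it to `ray.init`. It assumes the `uv` field accepts a list of requirement strings, as the `pip` field does; the package name `emoji` and the helper `start_ray_with_uv` are placeholders chosen for this example. Check the runtime environment documentation for the exact options in your Ray version.

```python
# Requirement strings to install with uv in each worker environment.
# "emoji" is only a placeholder package for this sketch.
runtime_env = {"uv": ["emoji"]}


def start_ray_with_uv():
    """Start Ray with the uv-based runtime env (requires Ray to be installed)."""
    import ray  # imported lazily so the config above can be inspected without Ray

    ray.init(runtime_env=runtime_env)

    @ray.remote
    def uses_dependency():
        import emoji  # expected on workers only through the runtime env

        return emoji.emojize("Python is :thumbs_up:")

    return ray.get(uses_dependency.remote())
```

Calling `start_ray_with_uv()` on a machine with Ray installed would start a local cluster and run one task inside the uv-provisioned environment.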
Enhancements:
- [CompiledGraphs] Refine schedule visualization (#48594)
Fixes:
- [CompiledGraphs] Don't persist input_nodes in _CollectiveOperation to avoid wrong understanding about DAGs (#48463)
- [Core] Fix Ascend NPU discovery to support 8+ cards per node (#48543)
- [Core] Make Placement Group Wildcard and Indexed Resource Assignments Consistent (#48088)
- [Core] Stop the gRPC server before shutting down the Object Store (#48572)
Ray Clusters
Fixes:
- [KubeRay]: Fix ConnectionError on Autoscaler CR lookups in K8s clusters with custom DNS for Kubernetes API. (#48541)
Dashboard
Enhancements:
- Add global UTC timezone button in navbar with local storage (#48510)
- Add memory graphs optimized for OOM debugging (#48530)
- Improve tasks/actors metric naming and add graph for running tasks (#48528)
- Add actor PID to dashboard (#48791)
Fixes:
- Fix cell overflow in the Placement Group Table (#47323)
- Fix Rows Outputted being zero on Ray Data Dashboard (#48745)
- Fix confusing dataset operator name (#48805)
Thanks
Thanks to all those who contributed to this release!
@rynewang, @rickyyx, @bveeramani, @marwan116, @simonsays1980, @dayshah, @dentiny, @KepingYan, @mimiliaogo, @kevin85421, @SeaOfOcean, @stephanie-wang, @mohitjain2504, @azayz, @xushiyan, @richardliaw, @can-anyscale, @xingyu-long, @kanwang, @aslonnie, @MortalHappiness, @jjyao, @SumanthRH, @matthewdeng, @alexeykudinkin, @sven1977, @raulchen, @andrewsykim, @zcin, @nadongjun, @hongpeng-guo, @miguelteixeiraa, @saihaj, @khluu, @ArturNiederfahrenhorst, @ryanaoleary, @ltbringer, @pcmoritz, @JoshKarpel, @akyang-anyscale, @frances720, @BeingGod, @edoakes, @Bye-legumes, @Superskyyy, @liuxsh9, @MengjinYan, @ruisearch42, @scottjlee, @angelinalg