Ray-2.47.0

Release Highlights

  β€’ Initial support for prefill disaggregation has landed in Ray Serve LLM (#53092). This is critical for production LLM serving use cases.
  β€’ Ray Data features a variety of performance improvements (locality-based scheduling, non-blocking execution), along with observability and preprocessor improvements and other stability fixes.
  β€’ Ray Serve now supports custom request routing algorithms, which is critical for high-throughput serving of large models.

Ray Libraries

Ray Data

πŸŽ‰ New Features:

  β€’ Add save modes support to file data sinks; a usage sketch follows this list (#52900)
  • Added flattening capability to the Concatenator preprocessor to support output vectorization use cases (#53378)
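
The save-modes feature (#52900) controls what a file datasink does when the output path already contains data. A minimal sketch, assuming the behavior is exposed as a `mode` argument on file-writing APIs such as `write_parquet` (the parameter name and accepted values here are assumptions drawn from the PR title, not a verified signature):

```python
import ray

ds = ray.data.range(100)

# Assumed `mode` argument from #52900; check the write_parquet API
# reference for the exact parameter name and accepted values.
ds.write_parquet("/tmp/demo_out", mode="overwrite")  # replace existing data
ds.write_parquet("/tmp/demo_out", mode="append")     # add files alongside it
```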

πŸ’« Enhancements:

  β€’ Re-enable Actor locality-based scheduling and improve the algorithm for ranking candidate locations for a bundle (#52861)
  β€’ No longer block the pipeline by default while the Actor Pool scales up to its minimum number of actors (#52754)
  β€’ Progress bar and dashboard improvements to properly show the names of partial functions (#52280)

πŸ”¨ Fixes:

  • Make Ray Data from_torch respect Dataset len (#52804)
  β€’ Fix flaky aggregation test (#53383)
  • Fix race condition bug in fault tolerance by disabling on_exit hook (#53249)
  • Fix move_tensors_to_device utility for the list/tuple[tensor] case (#53109)
  • Fix ActorPool scaling to avoid scaling down when the input queue is empty (#53009)
  β€’ Fix internal queue accounting for all Operators with an internal queue (#52806)
  • Fix backpressure for FileBasedDatasource. This fixes potential OOMs for workloads using FileBasedDatasources (#52852)

πŸ“– Documentation:

  • Fix working code snippets (#52748)
  • Improve AggregateFnV2 docstrings and examples (#52911)
  • Improved documentation for vectorizers and API visibility in Data (#52456)

Ray Train

πŸŽ‰ New Features:

  • Added support for configuring Ray Train worker actor runtime environments. (#52421)
  • Included Grafana panel data in Ray Train export for improved monitoring. (#53072)
  • Introduced a structured logging environment variable to standardize log formats. (#52952)
  • Added metrics for TrainControllerState to enhance observability. (#52805)

πŸ’« Enhancements:

  β€’ Added logging of controller state transitions to aid debugging and analysis. (#53344)
  • Improved handling of Noop scaling decisions for smoother scaling logic. (#53180)

πŸ”¨ Fixes:

  • Improved move_tensors_to_device utility to correctly handle list / tuple of tensors. (#53109)
  • Fixed GPU transfer support for non-contiguous tensors. (#52548)
  • Increased timeout in test_torch_device_manager to reduce flakiness. (#52917)

πŸ“– Documentation:

  • Added a note about PyTorch DataLoader’s multiprocessing and forkserver usage. (#52924)
  • Fixed various docstring format and indentation issues. (#52855, #52878)
  • Removed unused "configuration-overview" documentation page. (#52912)
  • General typo corrections. (#53048)

πŸ— Architecture refactoring:

  • Deduplicated ML doctest runners in CI for efficiency. (#53157)
  • Converted isort configuration to Ruff for consistency. (#52869)
  • Removed unused PARALLEL_CI blocks and combined imports. (#53087, #52742)

Ray Tune

πŸ’« Enhancements:

  • Updated test_train_v2_integration to use the correct RunConfig. (#52882)

πŸ“– Documentation:

  • Replaced session.report with tune.report and corrected import paths. (#52801)
  • Removed outdated graphics cards reference in docs. (#52922)
  • Fixed various docstring format issues. (#52879)

Ray Serve

πŸŽ‰ New Features:

  β€’ Added support for implementing custom request routing algorithms; see the sketch after this list. (#53251)
  • Introduced an environment variable to prioritize custom resources during deployment scheduling. (#51978)
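
A rough sketch of a custom router under the new API (#53251). The `RequestRouter` base class, its `choose_replicas` hook, and `RequestRouterConfig` below follow the custom request routing documentation for this release, but treat the exact import paths and signatures as assumptions:

```python
import random
from typing import List, Optional

from ray import serve
# Assumed import paths for the routing API introduced in #53251; verify
# against the "custom request routing" guide for this release.
from ray.serve.request_router import PendingRequest, RequestRouter, RunningReplica
from ray.serve.config import RequestRouterConfig


class RandomRouter(RequestRouter):
    """Toy router: pick one candidate replica uniformly at random."""

    async def choose_replicas(
        self,
        candidate_replicas: List[RunningReplica],
        pending_request: Optional[PendingRequest] = None,
    ) -> List[List[RunningReplica]]:
        # Return a ranked list of replica groups; the first non-empty
        # group is tried first.
        return [[random.choice(candidate_replicas)]]


@serve.deployment(
    request_router_config=RequestRouterConfig(request_router_class=RandomRouter),
)
class MyModel:
    def __call__(self, request) -> str:
        return "ok"
```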

πŸ’« Enhancements:

  • The ingress API now accepts a builder function in addition to an ASGI app object. (#52892)
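
A minimal sketch of the builder-function form (#52892): `serve.ingress` is given a zero-argument callable that returns the ASGI app, so the app is constructed on the replica rather than at import time. The exact accepted callable shape is an assumption from the change description:

```python
from fastapi import FastAPI
from ray import serve


def build_app() -> FastAPI:
    # Built lazily on each replica instead of at module import time.
    app = FastAPI()

    @app.get("/ping")
    def ping() -> dict:
        return {"status": "ok"}

    return app


# Pass the builder function itself, not an app instance (#52892).
@serve.deployment
@serve.ingress(build_app)
class Ingress:
    pass


app = Ingress.bind()  # deploy with `serve run module:app`
```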

πŸ”¨ Fixes:

  • Fixed runtime_env validation for py_modules. (#53186)
  • Disallowed special characters in Serve deployment and application names. (#52702)
  • Added a descriptive error message when a deployment name is not found. (#45181)

πŸ“– Documentation:

  • Updated the guide on serving models with Triton Server in Ray Serve.
  • Added documentation for custom request routing algorithms.

Ray Serve/Data LLM

πŸŽ‰ New Features:

  β€’ Added initial support for prefill/decode disaggregation (#53092)
  β€’ Exposed vLLM metrics in the serve.llm API (#52719)
  β€’ Added an Embedding API (#52229)
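
For context, these features hang off the existing `LLMConfig` / `build_openai_app` entry points that shipped in earlier releases. The sketch below is illustrative only: the model ids and engine kwargs are placeholders, and the claim that such a deployment serves OpenAI-style `/v1/embeddings` requests via the new Embedding API (#52229) comes from this changelog rather than verified docs:

```python
from ray import serve
from ray.serve.llm import LLMConfig, build_openai_app

# Illustrative config; model ids and engine kwargs are placeholders.
llm_config = LLMConfig(
    model_loading_config=dict(
        model_id="my-embedder",  # hypothetical served model name
        model_source="intfloat/e5-mistral-7b-instruct",
    ),
    deployment_config=dict(
        autoscaling_config=dict(min_replicas=1, max_replicas=2),
    ),
    engine_kwargs=dict(task="embed"),  # assumption: vLLM embedding task flag
)

# With the Embedding API (#52229), this app should also accept
# OpenAI-style /v1/embeddings requests for the model above.
# The exact argument shape of build_openai_app may differ; see the
# serve.llm API reference.
app = build_openai_app({"llm_configs": [llm_config]})
serve.run(app)
```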

πŸ’« Enhancements:

  • Allow setting name_prefix in build_llm_deployment (#53316)
  β€’ Minor bug fix for #53144: stop tokens cannot be null (#53288)
  • Add missing repetition_penalty vLLM sampling parameter (#53222)
  • Mitigate the serve.llm streaming overhead by properly batching stream chunks (#52766)
  • Fix test_batch_vllm leaking resources by using larger wait_for_min_actors_s

πŸ”¨ Fixes:

  β€’ LLMRouter.check_health() now delegates to LLMServer.check_health() (#53358)
  • Fix runtime passthrough and auto-executor class selection (#53253)
  • Update check_health return type (#53114)
  • Bug fix for duplication of <bos> token (#52853)
  β€’ Fix stream batching so the first part of the stream is no longer consumed by the router instead of being streamed back (#52848)

RLlib

πŸŽ‰ New Features:

  • Add GPU inference to offline evaluation. (#52718)

πŸ’« Enhancements:

  β€’ Reworked the examples for connector pipelines. (#52604)
  • Cleanup of meta learning classes and examples. (#52680)

πŸ”¨ Fixes:

  β€’ Fixed weight syncing in offline evaluation. (#52757)
  • Fixed bug in split_and_zero_pad utility function (related to complex structures vs simple values or np.arrays). (#52818)

Ray Core

πŸ’« Enhancements:

  β€’ uv run integration is now enabled by default, so you no longer need to set RAY_RUNTIME_ENV_HOOK; see the sketch after this list (#53060)
  β€’ Record GCS process metrics (#53171)
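
Since the hook is now on by default (#53060), a driver launched with `uv run` propagates its uv-managed environment to Ray workers automatically. A minimal sketch (the `emoji` dependency is just an example):

```python
# demo.py -- launch with:  uv run --with emoji demo.py
# With the uv integration enabled by default, workers resolve imports in
# the same uv environment as the driver; no RAY_RUNTIME_ENV_HOOK needed.
import emoji
import ray


@ray.remote
def emojize() -> str:
    # Runs in a worker process that shares the driver's uv environment.
    return emoji.emojize("Ray rocks :thumbs_up:")


if __name__ == "__main__":
    ray.init()
    print(ray.get(emojize.remote()))
```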

πŸ”¨ Fixes:

  • Improvements for using RuntimeEnv in the Job Submission API. (#52704)
  β€’ Close unused pipe file descriptors of the Raylet's child processes (#52700)
  • Fix race condition when canceling task that hasn't started yet (#52703)
  • Implement a thread pool and call the CPython API on all threads within the same concurrency group (#52575)
  • cgraph: Fix execution schedules with collective operations (#53007)
  • cgraph: Fix scalar tensor serialization edge case with serialize_to_numpy_or_scalar (#53160)
  • Fix the issue where a valid RestartActor rpc is ignored (#53330)
  • Fix reference counter crashes during worker graceful shutdown (#53002)

Dashboard

πŸŽ‰ New Features:

  • train: Add dynolog for on-demand GPU profiling for Torch training (#53191)

πŸ’« Enhancements:

  • Add configurability of 'orgId' param for requesting Grafana dashboards (#53236)

πŸ”¨ Fixes:

  β€’ Fix Grafana dashboard dropdowns for the Data and Train dashboards (#52752)
  β€’ Fix dashboard handling of daylight saving time (#52755)

Ray Container Images

πŸ’« Enhancements:

  • Upgrade h11 (#53361), requests, starlette, jinja2 (#52951), pyopenssl and cryptography (#52941)
  • Generate multi-arch image indexes (#52816)

Docs

πŸŽ‰ New Features:

  β€’ End-to-end example: Entity recognition with LLMs (#52342)
  β€’ End-to-end example: XGBoost tutorial (#52383)
  • End-to-end tutorial for audio transcription and LLM as judge curation (#53189)

πŸ’« Enhancements:

  β€’ Add pydoclint to pre-commit (#52974)

Thanks!

Thank you to everyone who contributed to this release!

@NeilGirdhar, @ok-scale, @JiangJiaWei1103, @brandonscript, @eicherseiji, @ktyxx, @MichalPitr, @GeneDer, @rueian, @khluu, @bveeramani, @ArturNiederfahrenhorst, @c8ef, @lk-chen, @alanwguo, @simonsays1980, @codope, @ArthurBook, @kouroshHakha, @Yicheng-Lu-llll, @jujipotle, @aslonnie, @justinvyu, @machichima, @pcmoritz, @saihaj, @wingkitlee0, @omatthew98, @can-anyscale, @nadongjun, @chris-ray-zhang, @dizer-ti, @matthewdeng, @ryanaoleary, @janimo, @crypdick, @srinathk10, @cszhu, @TimothySeah, @iamjustinhsu, @mimiliaogo, @angelinalg, @gvspraveen, @kevin85421, @jjyao, @elliot-barn, @xingyu-long, @LeoLiao123, @thomasdesr, @ishaan-mehta, @noemotiovon, @hipudding, @davidxia, @omahs, @MengjinYan, @dengwxn, @MortalHappiness, @alhparsa, @emmanuel-ferdman, @alexeykudinkin, @KunWuLuan, @dev-goyal, @sven1977, @akyang-anyscale, @GokuMohandas, @raulchen, @abrarsheikh, @edoakes, @JoshKarpel, @bhmiller, @seanlaii, @ruisearch42, @dayshah, @Bye-legumes, @petern48, @richardliaw, @rclough, @israbbani, @jiwq
