Ray-2.43.0

Highlights

  • This release features new modules in Ray Serve and Ray Data for integration with large language models, marking the first step toward addressing #50639. Previously, Ray Data and Ray Serve had limited support for LLM deployments: users had to manually configure and manage the underlying LLM engine. This release adds APIs for both batch inference and serving of LLMs within Ray, in ray.data.llm and ray.serve.llm. See the notes below for more details.
  • Ray Train V2 is available to try starting in Ray 2.43! Run your next Ray Train job with the RAY_TRAIN_V2_ENABLED=1 environment variable. See the migration guide for more information.
  • A new integration with uv run makes it easy to specify Python dependencies for both the driver and workers in a consistent way and enables quick iteration when developing Ray applications (#50160, #50462). Check out our blog post for more details.

Ray Libraries

Ray Data

🎉 New Features:

  • Ray Data LLM: We are introducing a new module in Ray Data for batch inference with LLMs. It offers a new Processor abstraction that interoperates with existing Ray Data pipelines (see the sketch after this list). This abstraction can be configured in two ways:
    • Using the vLLMEngineProcessorConfig, which configures vLLM to load model replicas for high-throughput model inference.
    • Using the HttpRequestProcessorConfig, which sends HTTP requests to an OpenAI-compatible endpoint for inference.
    • Documentation for these features can be found here.
  • Implement accurate memory accounting for UnionOperator (#50436)
  • Implement accurate memory accounting for all-to-all operations (#50290)
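
A minimal batch inference sketch. The configuration fields, helper names, and output column below are assumptions based on the documented API; consult the Ray Data LLM docs for the exact schema:

```python
import ray
from ray.data.llm import vLLMEngineProcessorConfig, build_llm_processor

# Assumed parameter names; see the Ray Data LLM documentation for the exact API.
config = vLLMEngineProcessorConfig(
    model="Qwen/Qwen2.5-0.5B-Instruct",
    concurrency=1,   # number of vLLM engine replicas
    batch_size=64,   # rows per inference batch
)

processor = build_llm_processor(
    config,
    # Map each input row to an OpenAI-style chat request.
    preprocess=lambda row: dict(
        messages=[{"role": "user", "content": row["prompt"]}],
        sampling_params=dict(temperature=0.3, max_tokens=128),
    ),
    # Keep only the generated text in the output dataset
    # (assumes the engine writes a "generated_text" column).
    postprocess=lambda row: dict(answer=row["generated_text"]),
)

ds = ray.data.from_items([{"prompt": "What is Ray?"}])
print(processor(ds).take_all())
```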

💫 Enhancements:

  • Support class constructor args for filter() (#50245); see the sketch after this list
  • Persist ParquetDatasource metadata. (#50332)
  • Rebase ShufflingBatcher onto try_combine_chunked_columns (#50296)
  • Improve warning message if required dependency isn't installed (#50464)
  • Move data-related test logic out of core tests directory (#50482)
  • Pass executor as an argument to ExecutionCallback (#50165)
  • Add operator id info to task+actor (#50323)
  • Abstract common methods and remove duplication in ArrowBlockAccessor and PandasBlockAccessor (#50498)
  • Warn if map UDF is too large (#50611)
  • Replace AggregateFn with AggregateFnV2, cleaning up Aggregation infrastructure (#50585)
  • Simplify Operator.repr (#50620)
  • Add TaskDurationStats and the on_execution_step callback (#50766)
  • Print Resource Manager stats in release tests (#50801)
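
For the filter() change, a small sketch of passing constructor args to a callable-class predicate. This assumes filter() mirrors map()'s fn_constructor_args and concurrency arguments; the class and column names are made up for illustration:

```python
import ray

# A callable-class predicate whose constructor args are supplied separately,
# so the same class can be reused with different thresholds.
class AboveThreshold:
    def __init__(self, threshold: int):
        self.threshold = threshold

    def __call__(self, row: dict) -> bool:
        return row["value"] > self.threshold

ds = ray.data.range(100).map(lambda row: {"value": row["id"]})

# Assumption: filter() accepts fn_constructor_args/concurrency like map() does.
filtered = ds.filter(
    AboveThreshold,
    fn_constructor_args=(50,),
    concurrency=2,
)
print(filtered.count())
```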

🔨 Fixes:

  • Fix invalid escape sequences in grouped_data.py docstrings (#50392)
  • Deflake test_map_batches_async_generator (#50459)
  • Avoid memory leak with pyarrow.infer_type on datetime arrays (#50403)
  • Fix parquet partition cols to support tensors types (#50591)
  • Fix aggregation protocol to be appropriately associative (#50757)

📖 Documentation:

  • Remove "Stable Diffusion Batch Prediction with Ray Data" example (#50460)

Ray Train

🎉 New Features:

  • Ray Train V2 is available to try starting in Ray 2.43! Run your next Ray Train job with the RAY_TRAIN_V2_ENABLED=1 environment variable. See the migration guide for more information.
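
A minimal opt-in sketch. The training function and worker count are placeholders; the environment variable can equivalently be exported in the shell before launching the job:

```python
import os

# Opt in to Ray Train V2 before importing ray.train
# (or export RAY_TRAIN_V2_ENABLED=1 in the shell instead).
os.environ["RAY_TRAIN_V2_ENABLED"] = "1"

from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer


def train_fn(config):
    # Your per-worker training loop goes here.
    ...


trainer = TorchTrainer(train_fn, scaling_config=ScalingConfig(num_workers=2))
result = trainer.fit()
```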

💫 Enhancements:

  • Add a training ingest benchmark release test (#50019, #50299) with a fault tolerance variant (#50399)
  • Add telemetry for Trainer usage in V2 (#50321)
  • Add pydantic as a ray[train] extra install (#46682)
  • Add state tracking to train v2 to make run status, run attempts, and training worker metadata observable (#50515)

🔨 Fixes:

  • Increase doc test parallelism (#50326)
  • Disable TF test for py312 (#50382)
  • Increase test timeout to deflake (#50796)

📖 Documentation:

  • Add missing xgboost pip install in example (#50232)

Ray Tune

🔨 Fixes:

  • Fix worker node failure test (#50109)

📖 Documentation:

  • Update all doc examples to stop using ray.train imports (#50458)
  • Update all ray/tune/examples to stop using ray.train imports (#50435)
  • Fix typos in persistent storage guide (#50127)
  • Remove Binder notebook links in Ray Tune docs (#50621)

🏗 Architecture refactoring:

  • Update RLlib to use ray.tune imports instead of ray.air and ray.train (#49895)

Ray Serve

🎉 New Features:

  • Ray Serve LLM: We are introducing a new module in Ray Serve for easily integrating open source LLMs into your Ray Serve deployment. This opens up the powerful capability of composing complex applications with multiple LLMs, a key pattern in emerging use cases such as agentic workflows. Ray Serve LLM offers a couple of core components, including:
    • VLLMService: A prebuilt deployment that offers a full-featured vLLM engine integration, with support for features such as LoRA multiplexing and multimodal language models.
    • LLMRouter: An out-of-the-box OpenAI-compatible model router that can route across multiple LLM deployments.
    • Documentation can be found at https://docs.ray.io/en/releases-2.43.0/serve/llm/overview.html. A minimal usage sketch follows this list.
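
A minimal serving sketch. The import paths, LLMConfig fields, and call signatures below are assumptions based on the documentation linked above; verify them against the docs before use:

```python
from ray import serve
from ray.serve.llm.configs import LLMConfig                    # assumed module path
from ray.serve.llm.deployments import VLLMService, LLMRouter   # assumed module path

# Assumed field names; see the linked docs for the exact LLMConfig schema.
llm_config = LLMConfig(
    model_loading_config=dict(
        model_id="qwen-0.5b",
        model_source="Qwen/Qwen2.5-0.5B-Instruct",
    ),
)

# One vLLM-backed deployment, fronted by an OpenAI-compatible router.
# Assumed call signatures; consult the linked docs.
vllm_deployment = VLLMService.as_deployment(
    llm_config.get_serve_options(name_prefix="VLLM:")
).bind(llm_config)
app = LLMRouter.as_deployment().bind([vllm_deployment])
serve.run(app)
```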

💫 Enhancements:

  • Add required_resources to REST API (#50058)

🔨 Fixes:

  • Fix batched requests hanging after cancellation (#50054)
  • Properly propagate backpressure error (#50311)

RLlib

🎉 New Features:

  • Added env vectorization support for multi-agent (new API stack). (#50437)

💫 Enhancements:

  • Various APPO/IMPALA acceleration efforts; reached 100k ts/sec on the Atari benchmark with 400 EnvRunners and 16 (multi-node) GPU Learners: #50760, #50162, #50249, #50353, #50368, #50379, #50440, #50477, #50527, #50528, #50600, #50309
  • Offline RL:
    • Remove all weight syncing to eval_env_runner_group from the training steps. (#50057)
    • Enable single-learner/multi-learner GPU training. (#50034)
    • Remove reference to MARWILOfflinePreLearner in OfflinePreLearner docstring. (#50107)
    • Add metrics to multi-agent replay buffers. (#49959)

🔨 Fixes:

  • Fix SPOT preemption tolerance for large AlgorithmConfig: Pass by reference to RolloutWorker (#50688)
  • Fix on_workers/env_runners_recreated callback being called twice. (#50172)
  • Fix default_resource_request: aggregator actors were missing from the placement group for the local Learner. (#50219, #50475)

📖 Documentation:

  • Docs re-do (new API stack):
    • Rewrite/enhance "getting started" rst page. (#49950)
    • Remove rllib-models.rst and fix broken html links. (#49966, #50126)

Ray Core and Ray Clusters

Ray Core

💫 Enhancements:

  • [Core] Enable users to configure Python standard log attributes for structured logging (#49871)
  • [Core] Prestart worker with runtime env (#49994)
  • [compiled graphs] Support experimental_compile(_default_communicator=comm) (#50023)
  • [Core] ray.util.Queue Empty and Full exceptions now extend queue.Empty and queue.Full (#50261); see the sketch after this list
  • [Core] Initial port of Ray to Python 3.13 (#47984)
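
For the ray.util.Queue change, a small sketch of what the new exception hierarchy enables: handlers written against the standard-library queue exceptions now also catch Ray's:

```python
import queue

import ray
from ray.util.queue import Queue

ray.init()
q = Queue(maxsize=1)

try:
    q.get(block=False)
except queue.Empty:
    # ray.util.queue.Empty now subclasses queue.Empty, so this handler fires.
    print("queue is empty")
```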

🔨 Fixes:

  • [Core] Ignore stale ReportWorkerBacklogRequest (#50280)
  • [Core] Fix check failure due to negative available resource (#50517)

Ray Clusters

📖 Documentation:

  • Update the KubeRay docs to v1.3.0.

Ray Dashboard

🎉 New Features:

  • Additional filters for job list page (#50283)

Thanks

Thank you to everyone who contributed to this release! 🥳
@liuxsh9, @justinrmiller, @CheyuWu, @400Ping, @scottsun94, @bveeramani, @bhmiller, @tylerfreckmann, @hefeiyun, @pcmoritz, @matthewdeng, @dentiny, @erictang000, @gvspraveen, @simonsays1980, @aslonnie, @shorbaji, @LeoLiao123, @justinvyu, @israbbani, @zcin, @ruisearch42, @khluu, @kouroshHakha, @sijieamoy, @SergeCroise, @raulchen, @anson627, @bluenote10, @allenyin55, @martinbomio, @rueian, @rynewang, @owenowenisme, @Betula-L, @alexeykudinkin, @crypdick, @jujipotle, @saihaj, @EricWiener, @kevin85421, @MengjinYan, @chris-ray-zhang, @SumanthRH, @chiayi, @comaniac, @angelinalg, @kenchung285, @tanmaychimurkar, @andrewsykim, @MortalHappiness, @sven1977, @richardliaw, @omatthew98, @fscnick, @akyang-anyscale, @cristianjd, @Jay-ju, @spencer-p, @win5923, @wxsms, @stfp, @letaoj, @JDarDagran, @jjyao, @srinathk10, @edoakes, @vincent0426, @dayshah, @davidxia, @DmitriGekhtman, @GeneDer, @HYLcool, @gameofby, @can-anyscale, @ryanaoleary, @eddyxu
