github ray-project/kuberay v1.4.0

2 months ago

Highlights

Enhanced Kubectl Plugin

KubeRay v1.4.0 introduces major improvements to the Kubectl Plugin:

  • Added a new scale command to scale worker groups in a RayCluster.
  • Extended the get command to support listing Ray nodes and worker groups.
  • Improved the create command:
    • Allows overriding default values in config files.
    • Supports additional fields such as Kubernetes labels and annotations, node selectors, ephemeral storage, ray start parameters, TPUs, autoscaler version, and more.

See Using the Kubectl Plugin (beta) and ray-project/ray#53886 (link will be updated to the docs site after merging) for more details.

KubeRay Dashboard (alpha)

Starting from v1.4.0, you can use the open source dashboard UI for KubeRay. This component is still experimental and not considered ready for production, but feedback is welcome.

The KubeRay dashboard is a web-based UI that lets you view and manage KubeRay resources running on your Kubernetes cluster. It is different from the Ray dashboard, which is part of the Ray cluster itself. The KubeRay dashboard provides a centralized view of all KubeRay resources.

See ray-project/ray#53830 for more information. (The link will be replaced with a link to the docs site after the PR is merged.)

Integration with kubernetes-sigs/scheduler-plugins

Starting with v1.4.0, KubeRay integrates with one more scheduler, kubernetes-sigs/scheduler-plugins, to support gang scheduling for RayCluster resources. Currently, only single scheduler mode is supported.

See KubeRay integration with scheduler plugins for details.

KubeRay APIServer V2 (alpha)

The new APIServer v2 provides an HTTP proxy interface compatible with the Kubernetes API. It enables users to manage Ray resources using standard Kubernetes clients.

Key features:

  • Full compatibility with Kubernetes OpenAPI Spec and CRDs.
  • Available as a Go library for building custom proxies with pluggable HTTP middleware.

APIServer v1 is now in maintenance mode and will no longer receive new features. v2 is still in alpha. Contributions and feedback are encouraged.
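
Because the v2 interface is described as compatible with the Kubernetes API, request paths should follow the usual Kubernetes convention for namespaced custom resources. The following is a minimal sketch under that assumption. The helper name `ray_resource_path` is hypothetical and not part of KubeRay, and the address where the proxy listens depends on your deployment.

```python
from typing import Optional
from urllib.parse import quote

# API group and version of the KubeRay CRDs.
RAY_API_GROUP = "ray.io"
RAY_API_VERSION = "v1"


def ray_resource_path(namespace: str,
                      plural: str = "rayclusters",
                      name: Optional[str] = None) -> str:
    """Build a Kubernetes-style path for a namespaced Ray custom resource.

    Hypothetical helper: it only assumes the standard layout
    /apis/<group>/<version>/namespaces/<namespace>/<plural>[/<name>].
    """
    path = "/apis/{}/{}/namespaces/{}/{}".format(
        RAY_API_GROUP, RAY_API_VERSION, quote(namespace, safe=""), plural
    )
    if name is not None:
        path += "/" + quote(name, safe="")
    return path
```

Appending the result to the proxy's base URL gives the endpoint that a standard Kubernetes client would call, for example to list the RayClusters in a namespace or fetch a single RayJob.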

Service Level Indicator (SLI) Metrics

KubeRay now includes SLI metrics to help monitor the state and performance of KubeRay resources.

See KubeRay Metrics Reference for details.

Breaking Changes

Default to Non-Login Bash Shell

Prior to v1.4.0, KubeRay ran most commands using a login shell. Starting from v1.4.0, the default shell is a non-login Bash shell. You can temporarily revert to login shell behavior using the ENABLE_LOGIN_SHELL environment variable, but using a login shell is not recommended, and this environment variable will be removed in a future release. (#3679)

If you encounter any issues with the new default behavior, please report them in #3822 instead of opening new issues.
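
To illustrate the difference, here is a minimal sketch of how a command line could be built for each mode. The helper `bash_argv` is hypothetical and does not reproduce KubeRay's actual container command. It relies only on standard Bash behavior: the `-l` flag makes Bash act as a login shell and read profile files such as /etc/profile, while a plain `bash -c` does not.

```python
from typing import List


def bash_argv(command: str, login: bool = False) -> List[str]:
    """Return an argv list that runs `command` under Bash.

    login=False -> `bash -c`  (non-login shell, no profile files read)
    login=True  -> `bash -lc` (login shell, profile files are read)
    """
    flags = "-lc" if login else "-c"
    return ["/bin/bash", flags, command]
```

If a workload depended on variables exported from a profile file, setting them explicitly in the container spec or image is likely more robust than relying on login shell behavior.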

Resource Name Changes and Length Validation

Before v1.4.0, KubeRay silently truncated resource names that were too long to fit within the Kubernetes 63-character limit. Starting from v1.4.0, we no longer truncate resource names implicitly. Instead, we emit an invalid spec event if a name is too long. (#3083)

We also shortened some resource names to leave more room under the length limit. The following changes were made:

  • The suffix of the headless service for RayCluster changes from headless-worker-svc to headless. (#3101)
  • The suffix of the RayCluster name changes from -raycluster-xxxxx to -xxxxx. (#3102)
  • The suffix of the head pod for RayCluster changes from -head-xxxxx to -head. (#3028)
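
To catch over-long names before applying a manifest, a pre-flight check along these lines may help. This is a minimal sketch, not KubeRay's validation code: the function name is hypothetical, and it checks only the 63-character limit mentioned above, not any suffixes KubeRay appends to derived resource names.

```python
K8S_NAME_MAX_LEN = 63  # limit cited in the release notes


def check_resource_name(name: str, max_len: int = K8S_NAME_MAX_LEN) -> None:
    """Raise ValueError for an over-long name instead of truncating it."""
    if len(name) > max_len:
        raise ValueError(
            "resource name {!r} is {} characters long; the limit is {}".format(
                name, len(name), max_len
            )
        )
```

Failing loudly mirrors the new behavior: an explicit error is easier to diagnose than a resource that quietly ends up with a different name than the one requested.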

Updated Autoscaler v2 configuration

Starting from v1.4.0, autoscaler v2 is configured using:

spec:
  autoscalerOptions:
    version: v2

You should not use the old RAY_enable_autoscaler_v2 environment variable.
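
For manifests held as Python dictionaries (for example, loaded from YAML), the migration could be sketched as follows. This is an illustration under stated assumptions, not an official tool: the function name is hypothetical, and it assumes the legacy variable was set in the `env` list of the head group's containers.

```python
import copy

LEGACY_ENV_VAR = "RAY_enable_autoscaler_v2"


def migrate_to_autoscaler_v2(raycluster: dict) -> dict:
    """Return a copy of a RayCluster manifest that uses the new setting.

    Sets spec.autoscalerOptions.version to "v2" and drops the legacy
    environment variable from the head group containers (assumed location).
    """
    migrated = copy.deepcopy(raycluster)
    spec = migrated.setdefault("spec", {})
    spec.setdefault("autoscalerOptions", {})["version"] = "v2"

    pod_spec = spec.get("headGroupSpec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        if "env" in container:
            container["env"] = [
                entry for entry in container["env"]
                if entry.get("name") != LEGACY_ENV_VAR
            ]
    return migrated
```

The input is left unmodified, so the original and migrated manifests can be compared before anything is applied to the cluster.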

See Autoscaler v2 Configuration for guidance.

Changelog
