github kubeflow/trainer v2.1.0-rc.0

pre-release, 21 hours ago

This is the Kubeflow Trainer v2.1.0-rc.0 pre-release. To install it, apply the manager and runtimes manifests:

kubectl apply --server-side -k "https://github.com/kubeflow/trainer.git/manifests/overlays/manager?ref=v2.1.0-rc.0"
kubectl apply --server-side -k "https://github.com/kubeflow/trainer.git/manifests/overlays/runtimes?ref=v2.1.0-rc.0"

New Features

Distributed AI Data Cache

Volcano Scheduler

  • feat: KEP-2437 - PodGroup Creation for Volcano Scheduler (#2729 by @Doris-xm)
  • feat(docs): KEP-2437-Support Volcano Scheduler in Kubeflow Trainer V2 (#2672 by @Doris-xm)

Runtime Updates

API Updates

  • feat(runtimes): add support for launcher resource allocation in MPI jobs (#2653 by @jskswamy)
  • feat(operator): add config api implementation (#2879 by @kapil27)
  • feat: Add PodTemplateOverrides into TrainJob V2 API (#2882 by @xigang)
  • feat(api): Add PodTemplateOverrides API into TrainJob (#2785 by @xigang)
  • feat(api): Sync TrainJob JobsStatus from JobSet ReplicatedJobsStatus (#2802 by @astefanutti)
  • feat: support imagePullSecrets in TrainJob pod spec overrides (#2806 by @toVersus)
  • feat: support affinity in TrainJob pod spec overrides (#2796 by @toVersus)
  • feat(operator): enforce RFC 1035 validation for TrainJob name (#2767 by @juniemariam)
  • feat: Add schedulingGates to PodSpecOverrides (#2700 by @astefanutti)
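The RFC 1035 name validation listed above (#2767) means some TrainJob names may now be rejected. A minimal sketch of a local pre-check follows, assuming the enforced rule matches the usual Kubernetes DNS-1035 label convention: at most 63 characters, lowercase alphanumerics or '-', starting with a letter and ending with an alphanumeric. The helper name `is_dns1035_label` is hypothetical and is not part of Kubeflow Trainer; the operator's actual check may differ in detail.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of Kubeflow Trainer): returns 0 if the
# argument looks like a DNS-1035 label, 1 otherwise.
# Assumption: the TrainJob name validation follows the standard
# Kubernetes DNS-1035 label rules.
is_dns1035_label() {
  local LC_ALL=C   # make [a-z] match ASCII lowercase only
  local name="$1"
  local re='^[a-z]([-a-z0-9]*[a-z0-9])?$'

  # Length must be between 1 and 63 characters.
  (( ${#name} >= 1 && ${#name} <= 63 )) || return 1

  # Must start with a letter, end with a letter or digit,
  # and contain only lowercase letters, digits, or '-'.
  [[ "$name" =~ $re ]]
}

# Example: check a few candidate names before creating a TrainJob.
for candidate in my-trainjob 1-bad-start Bad_Name; do
  if is_dns1035_label "$candidate"; then
    echo "$candidate: ok"
  else
    echo "$candidate: rejected"
  fi
done
```

Running a check like this in CI before submitting manifests can surface naming problems earlier than an admission-time rejection would.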

Version Upgrade

Bug Fixes

Misc
