github SeldonIO/seldon-core v2.10.0


Overview

Core 2.10.0 introduces significant new features, with a focus on scalability and usability, alongside a number of bugfixes.

Upgrading from previous Core 2 versions

  • All CRD changes maintain backward compatibility with existing CRs
  • We introduce new Core 2 scaling configuration options in SeldonConfig (config.ScalingConfig.*), as part of a wider goal of centralising Core 2 configuration and allowing configuration changes after the Core 2 cluster is deployed. To ensure a smooth transition, some of these options will only take effect starting from the next release; end-users are encouraged to set them to the desired values before upgrading to 2.11.

Upgrading via Helm is seamless, with existing Helm values used to fill in the new configuration options. If you are not using Helm, previous SeldonConfig CRs remain valid, but restrictive defaults will be applied for the scaling configuration. One parameter in particular, maxShardCountMultiplier [docs], must be set in order to take advantage of the new pipeline scalability features. This parameter can be changed, and its value is propagated to all components that use the config.
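As a hedged sketch of what this looks like (the exact schema of the scaling block may differ; the field path follows the config.ScalingConfig.* naming above, so verify against the linked docs), the new options live on the SeldonConfig CR:

```yaml
# Illustrative only: the scalingConfig block follows the config.ScalingConfig.*
# naming mentioned above; verify exact field names against the Seldon docs.
apiVersion: mlops.seldon.io/v1alpha1
kind: SeldonConfig
metadata:
  name: default
spec:
  config:
    scalingConfig:
      # Required to benefit from the new pipeline scalability features;
      # changes are propagated to all components that use the config.
      maxShardCountMultiplier: 2
```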

New features

  • Pipeline scalability features: all pipeline components (dataflow-engine, pipelinegateway, modelgateway) are now horizontally scalable, and are no longer limited to a number of replicas equal to the number of Kafka partitions per topic. [docs]
  • Integration with Kafka Schema Registry, providing visibility into the schema contracts for Model and Pipeline input and output topics when deploying Pipelines. This connects Core 2 pipelines to the broader Kafka ecosystem and facilitates integration with products like Kafka Connect and ksqlDB to build custom solutions for data streaming, processing, and logging tailored to your machine learning workflows. [docs]
  • A new translation layer converting the OpenAI REST API to and from OIP. This allows standard OpenAI libraries and clients to be used when communicating with LLMs deployed via the Seldon LLM module.
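Because the translation layer accepts the standard OpenAI REST shapes, any OpenAI-compatible client can talk to a Seldon-deployed LLM. The sketch below builds a standard chat-completions request body with the stdlib only; the base URL and model name are hypothetical placeholders, and in practice you would point an off-the-shelf OpenAI client at your Seldon endpoint instead.

```python
import json

# Hypothetical endpoint exposed by Seldon's OpenAI-to-OIP translation layer;
# the actual URL depends on your Seldon LLM module deployment.
BASE_URL = "http://seldon-mesh.example.com/v1"

def chat_completion_request(model: str, user_message: str) -> bytes:
    """Build a standard OpenAI chat-completions request body.

    Since Core 2.10 translates the OpenAI REST API to and from OIP,
    this same payload shape can be sent by any standard OpenAI client
    (e.g. the official `openai` package configured with BASE_URL).
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return json.dumps(payload).encode("utf-8")

body = chat_completion_request("my-llm", "Hello!")
print(json.loads(body)["model"])  # prints: my-llm
```

The body would then be POSTed to `BASE_URL + "/chat/completions"`, exactly as with any OpenAI-compatible server.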

Experimental features (early preview, not production-ready)

  • Configuration of inference servers as k8s Deployments rather than StatefulSets

Usability improvements

  • Pipeline control plane now more robust to disruptions, with fine-grained status updates propagated to Pipeline CR statuses
  • Faster pipeline data-plane recovery after component restarts
  • Eliminated sources of downtime during inference server replica restarts, together with more graceful shutdowns across all components
  • All Core 2 components now have associated k8s lifecycle probes

Bugfixes

  • Fix bug affecting availability on inference server start-up after a restart
  • Fix model native autoscaling remaining active after being disabled via config (once it had been activated). Model native autoscaling (based on lag) is disabled entirely in 2.10, until we implement wider fixes. Until then, we strongly recommend enabling server autoscaling and controlling model autoscaling via HPA or KEDA.
  • Fix issues with the rclone container becoming unresponsive after long periods of uptime.
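As a hedged sketch of the HPA route recommended above (resource names and the metric are placeholders; Seldon's autoscaling docs cover the recommended metrics and the interplay with server scaling), a standard Kubernetes HorizontalPodAutoscaler can drive model replicas:

```yaml
# Illustrative only: a plain Kubernetes HPA targeting a Core 2 Model CR.
# The model name and the custom metric below are placeholders; real setups
# typically feed an inference metric through a custom metrics adapter.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-model-hpa
spec:
  scaleTargetRef:
    apiVersion: mlops.seldon.io/v1alpha1
    kind: Model
    name: my-model
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: Pods
      pods:
        metric:
          name: infer_rps   # placeholder custom metric
        target:
          type: AverageValue
          averageValue: "10"
```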

Kudos

We would like to recognise the significant contributions made by @RobertSamoilescu to Core 2.

With contributions from @RobertSamoilescu, @domsolutions, @MiguelAAe, @lc525, @cherrymu, @paulb-seldon, @Rajakavitha1, and @monica-seldon.


Changelog

Dates are displayed in UTC. Generated by auto-changelog.

v2.10.0

8 October 2025

  • fix(probes): improve timing of k8s lifecycle probes #6861
  • fix(docs): update pipeline scalability docs with maxShardCountMultiplier info #6859
  • fix(scheduler): Add scaling config and upgrade paths #6833
  • fix(scheduler): typo when setting pipeline-gw status for a pipeline #6856
  • fix(scheduler): Allow for pipelines with some of their statuses set to PipelineStatusUnknown #6853
  • Removed redundant sorting by trigger for topology #6852
  • fix(scheduler, dataflow): pipeline loading/unloading on pipeline-gw and dataflow engine topology #6849
  • fix(pipeline-gw): temp preStop hook #6841
  • docs(pipeline): pipeline scalability docs #6838
  • enable Lychee for docs on v2 #6786
  • fix(modelgateway): Number of partitions retrieval #6828
  • fix(agent/rclone): rclone OOM #6830
  • fix proto imports #6836
  • fix unable to set 0 replicas #6834
  • feat(dataflow): added fullJitterBackoff ack for pipeline status #6831
  • feat(modelgw): modelgw status update #6799
  • fix(agent): force disable auto-scaling of models on agent/scheduler #6814
  • fix(operator): SubscribeControlPlane failure blocking loading other CRs #6824
  • feat(pipelinegw): pipeline status in pipelinegw #6767
  • remove import of undefined func #6822
  • feat(tests): pipeline scalability tests #6813
  • fix(docs): spelling and missing namespace attribute #6815
  • chore(helm): 3GB default dataflow memory req/limit #6811
  • feat(dataflow): pipeline status update #6757
  • fix blocked draining agents when waiting for model to be loaded which can't be loaded as no replicas available #6794
  • fix: repeated identical subscription reqs sent to kafka #6807
  • fix(model-gw): graceful shutdown of kafka consumers #6801
  • fix(docs): Schema Registry Environment Configuration #6804
  • docs(schema-registry): Installation guide #6785
  • feat(kafka): Schema registry #6689
  • feat: Schema Registry in Ansible configuration #6679
  • fix(dataflow): deprecated use of kafka streams Transformer classes #6795
  • fix(controller): Server scaling spec #6613
  • fix(operator): failed update status #6789
  • Added watches on models #6788
  • fix(helm): added changes required to configure annotations for controller deployment #6748
  • feat(dataflow-engine): health probes #6766
  • feat(translator): OpenAI API REST translation to OIP #6619
  • feat(scheduler): health probes #6756
  • fix(envoy): corrupt envoy yaml and no ALPN config #6763
  • gRPC graceful shutdown #6760
  • feat(model-gw): health probes #6745
  • feat(Scheduler): Deal with model replicas being set to 0 #6557
  • fix(agent): enable gRPC keep-alive #6621
  • feat(pipeline-gw): health probes #6728
  • fix(scheduler): race conditions #6747
  • feat(dataflow): pipeline parallel loading #6746
  • chore(tests): enable race detector #6614
  • fix(agent): model availability during inference pod deletion #6636
  • fix(pipeline-gw): re-publish in-flight reqs due to partition revoke #6695
  • fix(pipeline-gw): failed incoming reqs when partitions not available #6690
  • ci(lint): lint PR title via bash #6691
  • fix(Scheduler): No dataflow engines available for terminated pipelines #6519
  • feat: pipeline loadbalancer #6675
  • Fix resource allocation link #6673
  • fix: typo pipeline output #6647
  • feat: statefulsets to deployments for servers #6445
  • docs: Update test-installation.md #6629
  • feat(modelgw): modelgw scalability #6538
  • feat(pipelinegw): pipelinegw scalability to number of partitions #6600
  • feat(dataflow): dataflow scalability #6498
  • ci(Pipeline): Enable Go module caching #6618
  • Update Changelog #6605
  • Generating changelog for v2.10.0 ac92c10
  • GitBook: No commit message 9fe5525
  • Setting version for helm charts 1745e81
  • GitBook: No commit message a490853
  • Setting version for yaml manifests 17cbf19
