github SeldonIO/seldon-core v2.10.0


Overview

Core 2.10.0 introduces significant new features, with a focus on scalability and usability, alongside a number of bugfixes.

Upgrading from previous Core 2 versions

  • All CRD changes maintain backward compatibility with existing CRs
  • We introduce new Core 2 scaling configuration options in SeldonConfig (config.ScalingConfig.*), as part of a wider goal of centralising Core 2 configuration and allowing configuration changes after the Core 2 cluster is deployed. To ensure a smooth transition, some of these options will only take effect starting from the next release; end-users are encouraged to set them to the desired values before upgrading to 2.11.

Upgrading via Helm is seamless, with existing Helm values used to fill in the new configuration options. If you are not using Helm, previous SeldonConfig CRs remain valid, but restrictive defaults will be applied for the scaling configuration. One parameter in particular, maxShardCountMultiplier [docs], must be set in order to take advantage of the new pipeline scalability features. This parameter can be changed, and its value is propagated to all components that use the config.
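As a hedged sketch of what this looks like (the exact schema of the scaling block may differ; the field path follows the config.ScalingConfig.* naming above, so verify against the linked docs), the new options live on the SeldonConfig CR:

```yaml
# Illustrative only: the scalingConfig block follows the config.ScalingConfig.*
# naming mentioned above; verify exact field names against the Seldon docs.
apiVersion: mlops.seldon.io/v1alpha1
kind: SeldonConfig
metadata:
  name: default
spec:
  config:
    scalingConfig:
      # Required to benefit from the new pipeline scalability features;
      # changes are propagated to all components that use the config.
      maxShardCountMultiplier: 2
```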

New features

  • Pipeline scalability features: all pipeline components (dataflow-engine, pipelinegateway, modelgateway) are now horizontally scalable, and are no longer limited to a number of replicas equal to the number of Kafka partitions per topic. [docs]
  • Integration with Kafka Schema Registry, providing visibility into the schema contracts for Model and Pipeline input and output topics when deploying Pipelines. This connects Core 2 pipelines to the broader Kafka ecosystem and facilitates integration with products like Kafka Connect and ksqlDB to build custom solutions for data streaming, processing, and logging tailored to your machine learning workflows. [docs]
  • A new translation layer converting the OpenAI REST API to and from OIP. This allows standard OpenAI libraries and clients to be used when communicating with LLMs deployed via the Seldon LLM module.
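Because the translation layer accepts the standard OpenAI REST shapes, any OpenAI-compatible client can talk to a Seldon-deployed LLM. The sketch below builds a standard chat-completions request body with the stdlib only; the base URL and model name are hypothetical placeholders, and in practice you would point an off-the-shelf OpenAI client at your Seldon endpoint instead.

```python
import json

# Hypothetical endpoint exposed by Seldon's OpenAI-to-OIP translation layer;
# the actual URL depends on your Seldon LLM module deployment.
BASE_URL = "http://seldon-mesh.example.com/v1"

def chat_completion_request(model: str, user_message: str) -> bytes:
    """Build a standard OpenAI chat-completions request body.

    Since Core 2.10 translates the OpenAI REST API to and from OIP,
    this same payload shape can be sent by any standard OpenAI client
    (e.g. the official `openai` package configured with BASE_URL).
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return json.dumps(payload).encode("utf-8")

body = chat_completion_request("my-llm", "Hello!")
print(json.loads(body)["model"])  # prints: my-llm
```

The body would then be POSTed to `BASE_URL + "/chat/completions"`, exactly as with any OpenAI-compatible server.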

Experimental features (early preview, not production-ready)

  • Configuration of inference servers as k8s Deployments rather than StatefulSets

Usability improvements

  • Pipeline control plane now more robust to disruptions, with fine-grained status updates propagated to Pipeline CR statuses
  • Faster pipeline data-plane recovery after component restarts
  • Eliminated sources of downtime during inference server replica restarts, together with more graceful shutdowns across all components
  • All Core 2 components now have associated k8s lifecycle probes

Bugfixes

  • Fix bug affecting availability on inference server start-up after a restart
  • Fix model native autoscaling remaining active after being disabled via config (once it had been activated). Model native autoscaling (based on lag) is disabled entirely in 2.10, until we implement wider fixes. Until then, we strongly recommend enabling server autoscaling and controlling model autoscaling via HPA or KEDA.
  • Fix issues with the rclone container becoming unresponsive after long periods of uptime.
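As a hedged sketch of the HPA route recommended above (resource names and the metric are placeholders; Seldon's autoscaling docs cover the recommended metrics and the interplay with server scaling), a standard Kubernetes HorizontalPodAutoscaler can drive model replicas:

```yaml
# Illustrative only: a plain Kubernetes HPA targeting a Core 2 Model CR.
# The model name and the custom metric below are placeholders; real setups
# typically feed an inference metric through a custom metrics adapter.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-model-hpa
spec:
  scaleTargetRef:
    apiVersion: mlops.seldon.io/v1alpha1
    kind: Model
    name: my-model
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: Pods
      pods:
        metric:
          name: infer_rps   # placeholder custom metric
        target:
          type: AverageValue
          averageValue: "10"
```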

Kudos

We would like to recognise the significant contributions made by @RobertSamoilescu to Core 2.

With contributions from @RobertSamoilescu, @domsolutions, @MiguelAAe, @lc525, @cherrymu, @paulb-seldon, @Rajakavitha1, and @monica-seldon.


Changelog

Dates are displayed in UTC. Generated by auto-changelog.

v2.10.0

8 October 2025

  • fix(probes): improve timing of k8s lifecycle probes #6861
  • fix(docs): update pipeline scalability docs with maxShardCountMultiplier info #6859
  • fix(scheduler): Add scaling config and upgrade paths #6833
  • fix(scheduler): typo when setting pipeline-gw status for a pipeline #6856
  • fix(scheduler): Allow for pipelines with some of their statuses set to PipelineStatusUnknown #6853
  • Removed redundant sorting by trigger for topology #6852
  • fix(scheduler, dataflow): pipeline loading/unloading on pipeline-gw and dataflow engine topology #6849
  • fix(pipeline-gw): temp preStop hook #6841
  • docs(pipeline): pipeline scalability docs #6838
  • enable Lychee for docs on v2 #6786
  • fix(modelgateway): Number of partitions retrieval #6828
  • fix(agent/rclone): rclone OOM #6830
  • fix proto imports #6836
  • fix unable to set 0 replicas #6834
  • feat(dataflow): added fullJitterBackoff ack for pipeline status #6831
  • feat(modelgw): modelgw status update #6799
  • fix(agent): force disable auto-scaling of models on agent/scheduler #6814
  • fix(operator): SubscribeControlPlane failure blocking loading other CRs #6824
  • feat(pipelinegw): pipeline status in pipelinegw #6767
  • remove import of undefined func #6822
  • feat(tests): pipeline scalability tests #6813
  • fix(docs): spelling and missing namespace attribute #6815
  • chore(helm): 3GB default dataflow memory req/limit #6811
  • feat(dataflow): pipeline status update #6757
  • fix blocked draining agents when waiting for model to be loaded which can't be loaded as no replicas available #6794
  • fix: repeated identical subscription reqs sent to kafka #6807
  • fix(model-gw): graceful shutdown of kafka consumers #6801
  • fix(docs): Schema Registry Environment Configuration #6804
  • docs(schema-registry): Installation guide #6785
  • feat(kafka): Schema registry #6689
  • feat: Schema Registry in Ansible configuration #6679
  • fix(dataflow): deprecated use of kafka streams Transformer classes #6795
  • fix(controller): Server scaling spec #6613
  • fix(operator): failed update status #6789
  • Added watches on models #6788
  • fix(helm): added changes required to configure annotations for controller deployment #6748
  • feat(dataflow-engine): health probes #6766
  • feat(translator): OpenAI API REST translation to OIP #6619
  • feat(scheduler): health probes #6756
  • fix(envoy): corrupt envoy yaml and no ALPN config #6763
  • gRPC graceful shutdown #6760
  • feat(model-gw): health probes #6745
  • feat(Scheduler): Deal with model replicas being set to 0 #6557
  • fix(agent): enable gRPC keep-alive #6621
  • feat(pipeline-gw): health probes #6728
  • fix(scheduler): race conditions #6747
  • feat(dataflow): pipeline parallel loading #6746
  • chore(tests): enable race detector #6614
  • fix(agent): model availability during inference pod deletion #6636
  • fix(pipeline-gw): re-publish in-flight reqs due to partition revoke #6695
  • fix(pipeline-gw): failed incoming reqs when partitions not available #6690
  • ci(lint): lint PR title via bash #6691
  • fix(Scheduler): No dataflow engines available for terminated pipelines #6519
  • feat: pipeline loadbalancer #6675
  • Fix resource allocation link #6673
  • fix: typo pipeline output #6647
  • feat: statefulsets to deployments for servers #6445
  • docs: Update test-installation.md #6629
  • feat(modelgw): modelgw scalability #6538
  • feat(pipelinegw): pipelinegw scalability to number of partitions #6600
  • feat(dataflow): dataflow scalability #6498
  • ci(Pipeline): Enable Go module caching #6618
  • Update Changelog #6605
  • Generating changelog for v2.10.0 ac92c10
  • GitBook: No commit message 9fe5525
  • Setting version for helm charts 1745e81
  • GitBook: No commit message a490853
  • Setting version for yaml manifests 17cbf19
