Overview
Core 2.10.2 is a patch release that fixes several long-standing issues which became more visible in the 2.10.x releases:
- Pipelines that failed to create or delete would remain in that state, even if the cluster subsequently became healthy
- Potential gRPC stream blocking between the agent and scheduler, causing models not to load
- The operator could become blocked from reconciling custom resources, preventing administrators from making changes
- MLServer parallel workers not being set
- MLServer access log flooding
Bugfix details:
If the Kafka cluster was unhealthy and components such as model-gateway were unable to connect to a broker, a pipeline would end up in a failed state. The scheduler would be notified of this, but once the Kafka cluster became healthy again it would not retry creating the pipeline on the necessary services. With this fix, the scheduler now retries creating and deleting failed pipelines on a periodic basis. This is controlled by three environment variables on the scheduler (see the example after the list):
- RETRY_CREATING_FAILED_PIPELINES_TICK (default 60s): how often the scheduler will attempt to create pipelines which failed to create
- RETRY_DELETING_FAILED_PIPELINES_TICK (default 60s): how often the scheduler will attempt to delete pipelines which failed to delete
- MAX_RETRY_FAILED_PIPELINES (default 10): the maximum number of retries the scheduler will attempt
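As a minimal sketch, these could be injected as extra environment variables on the scheduler container; the exact mechanism depends on how you deploy Core 2, and the values shown simply restate the defaults.

```yaml
# Sketch: extra environment variables for the scheduler container.
# How they are injected (Helm values, kustomize overlay, or a direct patch
# of the scheduler workload) depends on your installation.
env:
  - name: RETRY_CREATING_FAILED_PIPELINES_TICK  # retry interval for failed pipeline creates
    value: "60s"
  - name: RETRY_DELETING_FAILED_PIPELINES_TICK  # retry interval for failed pipeline deletes
    value: "60s"
  - name: MAX_RETRY_FAILED_PIPELINES            # stop retrying after this many attempts
    value: "10"
```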
gRPC streams were not being handled properly when attempting to send data: the Go context was not being checked to see whether the receiver had closed the stream. This led to blocking issues where the scheduler would attempt to load a model on an agent whose stream had already been closed, which prevented the scheduler from taking further model loading actions. We noticed the same pattern in several other places, and we now verify that a stream is still active before attempting to send on it.
The operator sometimes has to re-attempt sending cluster state to the scheduler for any number of reasons (bad connectivity, the scheduler restarting after a failed liveness check, etc.). If this happens repeatedly, the operator can end up retrying for many hours, during which it is blocked from reconciling any resource. We have addressed this by setting a maximum retry limit on the exponential backoff settings and by enforcing timeouts on all network calls made by the operator.
The Helm charts were setting the wrong environment variable when configuring the number of parallel workers on MLServer. This would have caused increased latency and reduced throughput for any customers who had set this number > 1 (the default is 1). The Helm charts now use the correct variable, MLSERVER_PARALLEL_WORKERS, which defaults to 1. Note that this is the correct variable for MLServer versions >= 1.1.0; if you are using an older version, you should set MLSERVER_MODEL_PARALLEL_WORKERS manually, as the Helm charts no longer support it.
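For MLServer versions older than 1.1.0, one place the variable could be set manually is the server's ServerConfig custom resource (the same resource mentioned below for MLSERVER_DEBUG). The apiVersion, podSpec layout, and container name in this sketch are assumptions and should be checked against your installed CRDs.

```yaml
# Sketch only: manually setting the pre-1.1.0 parallel-workers variable on an
# MLServer ServerConfig. Field names and the "mlserver" container name are
# assumptions; merge this into your existing ServerConfig rather than applying
# it as-is.
apiVersion: mlops.seldon.io/v1alpha1
kind: ServerConfig
metadata:
  name: mlserver
spec:
  podSpec:
    containers:
      - name: mlserver
        env:
          - name: MLSERVER_MODEL_PARALLEL_WORKERS  # variable used by MLServer < 1.1.0
            value: "4"
```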
During inference, MLServer would log every request. Under high load this could add latency and reduce throughput, as some cloud providers throttle disk IO operations. This logging is now turned off by default and can be configured via MLSERVER_DEBUG on the ServerConfig custom resource for MLServer, or via the Helm value serverConfig.mlserver.debug.
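For example, a Helm values snippet re-enabling the request logs might look like the following; the value path serverConfig.mlserver.debug is taken from the note above, and everything else is illustrative.

```yaml
# Helm values sketch: serverConfig.mlserver.debug controls MLServer request
# logging, which is now off by default. Set it to true only if you need the
# per-request access logs back (e.g. while debugging).
serverConfig:
  mlserver:
    debug: true
```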
To aid with debugging issues within the scheduler, we have added pprof. It is turned off by default, but can be configured via the following environment variables (an example follows the list):
- ENABLE_PPROF (default false): enables the pprof HTTP endpoint on the scheduler
- PPROF_PORT (default 6060): the HTTP port used to access the performance dumps. Note it listens on localhost, so it can only be accessed via port-forwarding
- PPROF_BLOCK_RATE (default 0): controls how frequently blocking events (mutex contention, channel operations) are sampled; 1 captures every blocking event
- PPROF_MUTEX_RATE (default 0): controls how frequently mutex contention events are sampled; 1 captures every mutex contention
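As a sketch, the variables could be added to the scheduler container like this; how they are injected depends on your installation, and the sampling rates shown are just one reasonable choice.

```yaml
# Sketch: enabling pprof on the scheduler container. The injection mechanism
# (Helm values, kustomize overlay, direct patch) depends on your installation.
env:
  - name: ENABLE_PPROF
    value: "true"
  - name: PPROF_PORT          # pprof listens on localhost only
    value: "6060"
  - name: PPROF_BLOCK_RATE    # 1 = record every blocking event (0 disables)
    value: "1"
  - name: PPROF_MUTEX_RATE    # 1 = record every mutex contention (0 disables)
    value: "1"
```

With the pod running, something like kubectl port-forward pod/<scheduler-pod> 6060:6060 makes the profiles reachable on localhost:6060.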
Upgrading from previous Core 2 versions
No CRD changes are introduced in this patch release, but if you are upgrading from a version prior to 2.10.0 you should first read the 2.10.0 release notes. If you wish to set the number of parallel workers on MLServer to a value greater than 1, you will need to set serverConfig.mlserver.parallel_workers in your Helm values (only if you are running MLServer >= 1.1.0), as sketched below.
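For illustration, such a Helm values snippet might look like the following; the value path comes from the note above, and the worker count is an arbitrary example.

```yaml
# Helm values sketch: run 4 parallel MLServer workers (the default is 1).
# Applies to MLServer >= 1.1.0, which reads MLSERVER_PARALLEL_WORKERS.
serverConfig:
  mlserver:
    parallel_workers: 4
```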
Changelog
All notable changes to this project will be documented in this file. Dates are displayed in UTC.
Generated by auto-changelog.
v2.10.2
18 December 2025
- fix(mlserver): turn off logging request logs by default #7042
- feat(e2e-tests): added inference of pipeline #7029
- feat(e2e-tests): model experiment test #7023
- test model over-commit #7021
- fix model infer #7018
- feat(e2e-tests): pipeline tests #7010
- feat(e2e-tests): server setup #7012
- feat(e2e-tests): model deployment and inference of test models from python tests to bdd #7007
- feat(e2e-test): test for model deletion steps #7004
- refactor(e2e-tests): names and logger #6999
- config for tests #6994
- feat(e2e-test): gen client and deletion of resources #6995
- feat(all): Add release version to binaries #6912
- fix(godog): go mod module name #6991
- feat(e2e-test): test for custom model spec & inference via HTTP/gRPC #6979
- feat(operator): auto generated custom k8s client #6984
- fix(helm): mlserver env var parallel workers #6974
- Exp bdd tests #6965
- fix(scheduler/model-gw): failed pipelines never retried #6917
- docs(tracing.md): Tracing Page Update #6956
- Update observability.md #6958
- Update README.md #6948
- docs(multiplepages): fixed broken links #6914
- docs (Update pandasquery): fixed the broken link #6945
- fix prometheus installation #6931
- Add files via upload #6934
- Update README.md #6932
- Update open-inference-protocol-v2.openapi.yaml #6925
- feat(agent): improve error logging #6918
- Update open-inference-protocol-v2.openapi.yaml #6923
- Add files via upload #6920
- docs(kubernetes examples): updated the curl commands part2 #6913
- docs(kubernetes examples): updated the curl commands #6905
- fix(agent/scheduler/model-gw/pipeline-gw/operator): closing gRPC stream #6902
- fix(operator): Blocking gRPC calls #6898
- feat(scheduler): optionally enable pprof #6899
- Generating changelog for v2.10.2 70dac2b
- GitBook: No commit message c9bba8a
- Setting version for helm charts fd3f01d
- Setting version for yaml manifests 5d98bca