Overview
Core 2.10.2 is a patch release that fixes several long-standing issues which became more visible in the 2.10.x releases:
- Pipelines that failed to create or delete would remain in that state, even if the cluster subsequently became healthy
- Potential gRPC stream blocking between the agent and scheduler, causing models not to load
- The operator could become blocked from reconciling custom resources, preventing administrators from making changes
- MLServer parallel workers not being set
- MLServer access log flooding
Bugfix details:
If the Kafka cluster was unhealthy and components such as model-gateway were unable to connect to a broker, a pipeline would end up in a failed state. The scheduler would be notified of this, but once the Kafka cluster became healthy again it would not retry creating the pipeline on the necessary services. With this fix, the scheduler now retries creating and deleting failed pipelines on a periodic basis. This is controlled by three environment variables on the scheduler (see the example after the list):
- RETRY_CREATING_FAILED_PIPELINES_TICK (default 60s): how often the scheduler will attempt to create pipelines which failed to create
- RETRY_DELETING_FAILED_PIPELINES_TICK (default 60s): how often the scheduler will attempt to delete pipelines which failed to delete
- MAX_RETRY_FAILED_PIPELINES (default 10): the maximum number of retries the scheduler will attempt
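As a minimal sketch, these could be injected as extra environment variables on the scheduler container; the exact mechanism depends on how you deploy Core 2, and the values shown simply restate the defaults.

```yaml
# Sketch: extra environment variables for the scheduler container.
# How they are injected (Helm values, kustomize overlay, or a direct patch
# of the scheduler workload) depends on your installation.
env:
  - name: RETRY_CREATING_FAILED_PIPELINES_TICK  # retry interval for failed pipeline creates
    value: "60s"
  - name: RETRY_DELETING_FAILED_PIPELINES_TICK  # retry interval for failed pipeline deletes
    value: "60s"
  - name: MAX_RETRY_FAILED_PIPELINES            # stop retrying after this many attempts
    value: "10"
```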
gRPC streams were not being handled properly when attempting to send data: the Go context was not being checked to see whether the receiver had closed the stream. This led to blocking issues where the scheduler would attempt to load a model on an agent whose stream had already been closed, which prevented the scheduler from taking further model loading actions. We noticed the same pattern in several other places, and we now verify that a stream is still active before attempting to send on it.
The operator sometimes has to re-attempt sending cluster state to the scheduler for any number of reasons (bad connectivity, the scheduler restarting after a failed liveness check, etc.). If this happens repeatedly, the operator can end up retrying for many hours, during which it is blocked from reconciling any resource. We have addressed this by setting a maximum retry limit on the exponential backoff settings and by enforcing timeouts on all network calls made by the operator.
The Helm charts were setting the wrong environment variable when configuring the number of parallel workers on MLServer. This would have caused increased latency and reduced throughput for any customers who had set this number > 1 (the default is 1). The Helm charts now use the correct variable, MLSERVER_PARALLEL_WORKERS, which defaults to 1. Note that this is the correct variable for MLServer versions >= 1.1.0; if you are using an older version, you should set MLSERVER_MODEL_PARALLEL_WORKERS manually, as the Helm charts no longer support it.
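For MLServer versions older than 1.1.0, one place the variable could be set manually is the server's ServerConfig custom resource (the same resource mentioned below for MLSERVER_DEBUG). The apiVersion, podSpec layout, and container name in this sketch are assumptions and should be checked against your installed CRDs.

```yaml
# Sketch only: manually setting the pre-1.1.0 parallel-workers variable on an
# MLServer ServerConfig. Field names and the "mlserver" container name are
# assumptions; merge this into your existing ServerConfig rather than applying
# it as-is.
apiVersion: mlops.seldon.io/v1alpha1
kind: ServerConfig
metadata:
  name: mlserver
spec:
  podSpec:
    containers:
      - name: mlserver
        env:
          - name: MLSERVER_MODEL_PARALLEL_WORKERS  # variable used by MLServer < 1.1.0
            value: "4"
```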
During inference, MLServer would log every request. Under high load this could add latency and reduce throughput, as some cloud providers throttle disk IO operations. This logging is now turned off by default and can be configured via MLSERVER_DEBUG on the ServerConfig custom resource for MLServer, or via the Helm value serverConfig.mlserver.debug.
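For example, a Helm values snippet re-enabling the request logs might look like the following; the value path serverConfig.mlserver.debug is taken from the note above, and everything else is illustrative.

```yaml
# Helm values sketch: serverConfig.mlserver.debug controls MLServer request
# logging, which is now off by default. Set it to true only if you need the
# per-request access logs back (e.g. while debugging).
serverConfig:
  mlserver:
    debug: true
```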
To aid with debugging issues within the scheduler, we have added pprof. It is turned off by default, but can be configured via the following environment variables (an example follows the list):
- ENABLE_PPROF (default false): enables the pprof HTTP endpoint on the scheduler
- PPROF_PORT (default 6060): the HTTP port used to access the performance dumps. Note it listens on localhost, so it can only be accessed via port-forwarding
- PPROF_BLOCK_RATE (default 0): controls how frequently blocking events (mutex contention, channel operations) are sampled; 1 captures every blocking event
- PPROF_MUTEX_RATE (default 0): controls how frequently mutex contention events are sampled; 1 captures every mutex contention
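As a sketch, the variables could be added to the scheduler container like this; how they are injected depends on your installation, and the sampling rates shown are just one reasonable choice.

```yaml
# Sketch: enabling pprof on the scheduler container. The injection mechanism
# (Helm values, kustomize overlay, direct patch) depends on your installation.
env:
  - name: ENABLE_PPROF
    value: "true"
  - name: PPROF_PORT          # pprof listens on localhost only
    value: "6060"
  - name: PPROF_BLOCK_RATE    # 1 = record every blocking event (0 disables)
    value: "1"
  - name: PPROF_MUTEX_RATE    # 1 = record every mutex contention (0 disables)
    value: "1"
```

With the pod running, something like kubectl port-forward pod/<scheduler-pod> 6060:6060 makes the profiles reachable on localhost:6060.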
Upgrading from previous Core 2 versions
No CRD changes are introduced in this patch release, but if you are upgrading from a version prior to 2.10.0 you should first read the 2.10.0 release notes. If you wish to set the number of parallel workers on MLServer to a value greater than 1, you will need to set serverConfig.mlserver.parallel_workers in your Helm values (only if you are running MLServer >= 1.1.0), as sketched below.
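For illustration, such a Helm values snippet might look like the following; the value path comes from the note above, and the worker count is an arbitrary example.

```yaml
# Helm values sketch: run 4 parallel MLServer workers (the default is 1).
# Applies to MLServer >= 1.1.0, which reads MLSERVER_PARALLEL_WORKERS.
serverConfig:
  mlserver:
    parallel_workers: 4
```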
Changelog
All notable changes to this project will be documented in this file. Dates are displayed in UTC.
Generated by auto-changelog.
v2.10.2
18 December 2025
- fix(mlserver): turn off logging request logs by default #7042
- feat(e2e-tests): added inference of pipeline #7029
- feat(e2e-tests): model experiment test #7023
- test model over-commit #7021
- fix model infer #7018
- feat(e2e-tests): pipeline tests #7010
- feat(e2e-tests): server setup #7012
- feat(e2e-tests): model deployment and inference of test models from python tests to bdd #7007
- feat(e2e-test): test for model deletion steps #7004
- refactor(e2e-tests): names and logger #6999
- config for tests #6994
- feat(e2e-test): gen client and deletion of resources #6995
- feat(all): Add release version to binaries #6912
- fix(godog): go mod module name #6991
- feat(e2e-test): test for custom model spec & inference via HTTP/gRPC #6979
- feat(operator): auto generated custom k8s client #6984
- fix(helm): mlserver env var parallel workers #6974
- Exp bdd tests #6965
- fix(scheduler/model-gw): failed pipelines never retried #6917
- docs(tracing.md): Tracing Page Update #6956
- Update observability.md #6958
- Update README.md #6948
- docs(multiplepages): fixed broken links #6914
- docs (Update pandasquery): fixed the broken link #6945
- fix prometheus installation #6931
- Add files via upload #6934
- Update README.md #6932
- Update open-inference-protocol-v2.openapi.yaml #6925
- feat(agent): improve error logging #6918
- Update open-inference-protocol-v2.openapi.yaml #6923
- Add files via upload #6920
- docs(kubernetes examples): updated the curl commands part2 #6913
- docs(kubernetes examples): updated the curl commands #6905
- fix(agent/scheduler/model-gw/pipeline-gw/operator): closing gRPC stream #6902
- fix(operator): Blocking gRPC calls #6898
- feat(scheduler): optionally enable pprof #6899
- Generating changelog for v2.10.2 70dac2b
- GitBook: No commit message c9bba8a
- Setting version for helm charts fd3f01d
- Setting version for yaml manifests 5d98bca