🌈 What's New?
This release introduces two new CRDs, ServingRuntime and ClusterServingRuntime. The only difference between them is that the former is namespace-scoped and the latter is cluster-scoped. A ServingRuntime defines the templates for Pods that can serve one or more particular model formats. Each ServingRuntime defines key information such as the container image of the runtime and a list of the model formats that the runtime supports.
In previous versions of KServe, supported predictor formats and container images were defined in a config map in the control plane namespace. The ServingRuntime CRD makes it more flexible and extensible to define or customize runtimes as you see fit, without having to modify any controller code or any resources in the controller namespace.
Several out-of-the-box ClusterServingRuntimes are provided with KServe so that users can continue to use KServe as they did before, without having to define the runtimes themselves.
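To make the shape of these resources more concrete, here is a minimal sketch that builds a ServingRuntime or ClusterServingRuntime manifest as a plain Python dict and prints it as JSON. The helper `make_serving_runtime`, the runtime name, and the image `example.com/sklearnserver:latest` are hypothetical. The `apiVersion` and the spec field names (`supportedModelFormats`, `containers`) are assumptions about the CRD schema and should be checked against the CRDs installed in your cluster.

```python
import json


def make_serving_runtime(name, image, model_formats,
                         cluster_scoped=False, namespace="default"):
    """Build a ServingRuntime or ClusterServingRuntime manifest as a dict.

    Hypothetical helper for illustration only; the apiVersion and spec
    field names are assumptions, not taken from these release notes.
    """
    metadata = {"name": name}
    if not cluster_scoped:
        # Only the namespace-scoped kind carries a namespace.
        metadata["namespace"] = namespace
    return {
        "apiVersion": "serving.kserve.io/v1alpha1",  # assumed API group/version
        "kind": "ClusterServingRuntime" if cluster_scoped else "ServingRuntime",
        "metadata": metadata,
        "spec": {
            # Model formats this runtime claims to support.
            "supportedModelFormats": [
                {"name": fmt, "version": ver} for fmt, ver in model_formats
            ],
            # Container image that serves those formats.
            "containers": [{"name": "kserve-container", "image": image}],
        },
    }


runtime = make_serving_runtime(
    "example-sklearn-runtime",           # hypothetical runtime name
    "example.com/sklearnserver:latest",  # hypothetical image
    [("sklearn", "1")],
)
print(json.dumps(runtime, indent=2))
```

Passing `cluster_scoped=True` switches the kind to ClusterServingRuntime and drops the namespace, which mirrors the only difference between the two CRDs described above.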
- Add ServingRuntime support by @pvaneck in #1901 and @Suresh-Nakkeran in #1926
- Support auto selection for servingRuntimes by @Suresh-Nakkeran in #1948
- Add multiModel field to ServingRuntime spec by @pvaneck in #1983
- Python SDK for servingRuntimes by @Suresh-Nakkeran in #1946
- Support gRPC between transformer and predictor by @xcjason in #1933
- Torchserve v2 REST protocol support by @jagadeeshi2i in #1870
- Update CloudEvent Handling in Python SDK by @markwinter in #1934
- sklearnserver: allow mixed type inputs by @Suresh-Nakkeran in #1972
⚠️ What's Changed
- Rename KF-prefixed Python SDK classes by @markwinter in #1951
  - KFModel -> Model
  - KFServer -> ModelServer
  - KFModelRepository -> ModelRepository
- KServe's pytorchserver has been deprecated; for PyTorch models, KServe now defaults to the TorchServe serving runtime.
- The ONNX runtime server has been deprecated; for ONNX models, KServe now defaults to Triton Inference Server.
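For the Python SDK class renames listed above, existing code that refers to the old names needs updating. The following is a minimal sketch of a text-based migration helper, assuming a simple whole-word substitution is enough for your codebase. `RENAMES` and `migrate_source` are hypothetical names introduced here and are not part of the KServe SDK.

```python
import re

# Old class name -> new class name, as listed in these release notes.
RENAMES = {
    "KFModel": "Model",
    "KFServer": "ModelServer",
    "KFModelRepository": "ModelRepository",
}

# Match whole identifiers only, so "KFModel" does not match inside
# "KFModelRepository" or inside an unrelated name such as "MyKFModelX".
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(RENAMES, key=len, reverse=True)) + r")\b"
)


def migrate_source(text):
    """Return text with the KF-prefixed class names replaced."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], text)


old_code = (
    "from kserve import KFModel, KFServer, KFModelRepository\n"
    "class MyModel(KFModel):\n"
    "    pass\n"
)
print(migrate_source(old_code))
```

A plain substitution like this also rewrites matches inside strings and comments, so review the resulting diff before committing it.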
⬆️ Version upgrades
- Update cert-manager to v1 by @andyi2it in #1904
- Update ray to 1.9.0 and add ray tests by @markwinter in #1949
- Upgrade sklearnserver to use scikit-learn==1.0.1 and xgbserver to use xgboost==1.5 by @yuzisun in #1954
- Bump mlserver to 0.5.3 by @adriangonz in #1853
- Upgrade Golang to 1.17 by @haoxins in #1962
- Update KServe controller to use Knative 1.0 version by @yuzisun in #1969
- Update lightgbm version by @yuzisun in #2033
- Update helm chart for release v0.8.0 by @Suresh-Nakkeran in #2008
🐞 Fixes
- Explainer service account not being attached issue fix by @Suresh-Nakkeran in #1868
- storage.py: add more logging and an error condition for s3 by @elukey in #1883
- Fix model agent when deployed as raw deployment by @andyi2it in #1891
- Return only JSON responses from KFServer by @markwinter in #1918
- Throw error if storage initializer can not locate PVC source uri by @yuzisun in #1940
- Add content type for transformer to predictor HTTP call #1941 by @caffeinism in #1950
- Allow to set worker count more than 1 by @Suresh-Nakkeran in #1984
- Doc spelling fixes by @jsoref in #1970
- Allow later versions of google-cloud-storage by @aodahl in #1978
- Fixes Azure blob download as class is no longer hashable by @laozc in #1971
- Never ever use the default serviceaccount, but kserve-controller-manager by @juliusvonkohout in #1996
- Fix security issues in kserve manifest by @yuzisun in #1997
- Use non root user for python server images by @yuzisun in #2005
- Fix AIX explainer example pip dependency by @yuzisun in #2021
- Model agent to support custom container port by @andyi2it in #1905
- Support extracting archive with directories by @laozc in #1988
- Fix status condition names by @yuzisun in #2016
- Fix canary rollout falling back to prev rolledout version by @yuzisun in #2040
Full Changelog: v0.7.0...v0.8.0