🌈 What's New?
In this release we are excited to introduce the new InferenceGraph feature, which the community has long requested. This release also continues the effort from the last release to unify the InferenceService API for deploying models on KServe and ModelMesh: ModelMesh is now fully compatible with the KServe InferenceService API!
Core Inference
- Add mlflow support by @Suresh-Nakkeran in #2034
- Add autoscaling target and metric to isvc components by @andyi2it in #2082
- Add ingress class name configuration by @pradithya in #2049
- Add template for generating inference service domain by @pradithya in #2054
- Add Model Status API to isvc by @pvaneck in #2084
- Add logic to update ModelStatus by @Suresh-Nakkeran in #2088
- Allow InferenceService url scheme to be configurable by @markwinter in #2202
Advanced Inference
- Initial inference graph API implementation by @yuzisun @Iamlovingit @njhill in #1910
- InferenceGraph python sdk by @andyi2it in #2341
- Enable transformers to work with ModelMesh by @chinhuang007 in #2136
Model Storage Provider
- Introduce new storage spec for unified configuration by @Tomcli in #1899
- Add Azure file share support by @laozc @Suresh-Nakkeran in #2180
- Support webhdfs in storageURI and storage spec by @markwinter in #2077
Serving Runtime
- Add protocolversion in servingruntime spec by @Suresh-Nakkeran in #2118
- Add Volumes to ServingRuntimePodSpec; allow other built-in ServerTypes by @njhill in #2147
- Allow more fields in servingruntime container spec by @Suresh-Nakkeran in #2112
- Add env field to ServingRuntime builtInAdapter settings by @njhill in #2123
⚠️ What's Changed
- Convert kserve manager from statefulset to deployment for high availability (HA) by @Suresh-Nakkeran in #2160 #2348
  - The statefulset will be removed in 0.10.
🐞 Fixes
- Update ray to 1.10 for log4j security vulnerability fix by @markwinter @andyi2it in #2056 #2322
- Runtimes installation issue fix by @Suresh-Nakkeran in #2071
- Fix: replace image tag issue by @ittus in #2074
- Fix status RestURL type by @pvaneck in #2121
- Use default port for raw deployment as fail safe by @andyi2it in #2116
- Fix canary rollout falling back to previously rolled out version by @wenyangchou in #2097
- Fix: predict address url on status object by @Suresh-Nakkeran in #2146
- Fix downloading model with nested sub folders from gcs by @andyi2it in #2152
- Delete only the trainedmodel in the namespace where the isvc by @hehe04 in #2166
- Loosen setuptools lower pin by @ddelange in #2231
- Fix KServe dependencies conflicts with KFP by @safoinme in #2295
- Update leader-election-role name by @pvaneck in #2296
- Fix HPA Scaling for Raw Deployment mode by @Iamlovingit in #2279
- Use enum value in request validation by @luranhe in #2249
- Fix: Prevent all predictor defaulting with MM by @pvaneck in #2288
- Fix incorrect function definition, issue #2258 by @eyalcha in #2266
⬆️ Version upgrades
- go mod: Upgrade to ginkgo v2 by @haoxins in #2062
- upgrade alibi version to 0.6.4 by @Suresh-Nakkeran in #2092
- Upgrade kserve python dep by @Suresh-Nakkeran in #2103
- Bump torchserve version to 0.6.0 by @jagadeeshi2i in #2214
- Update python dep for kserve sdk by @yuzisun in #2216
- Upgrade mlserver version to 1.0.0 by @Suresh-Nakkeran in #2262
📖 Documentation
- TorchServe - KServe v2 - Examples update by @shrinath-suresh in #2035
- TorchServe - KServe v2 - bert explanation by @shrinath-suresh in #2043
- Fix the required K8s version comments by @haoxins in #2076
- Fix triton + torchscript guidelines to be executable by @Curt-Park in #2091
- fix: configmap url in sklearn example by @ittus in #2072
- Fix e2e test example gcs bucket by @yuzisun in #2134
- Update docs/samples/pipelines/ by @georgetree in #2122
- Update Infer proto docs to mention BFloat16 type by @rmccorm4 in #2159
- Add cherry pick script and document cherrypick process by @yuzisun in #2153
- Made changes to update sdk and docs by running code-gen by @andyi2it in #2162
- Update documentation for getting Prometheus metrics. by @shrinandj in #2171
- Update deprecated gcs bucket by @yuzisun in #2215
- Update Feast transformer to support ModelMesh by @chinhuang007 in #2204
- update graph sample by @Iamlovingit in #2223
⚒️ Developer Experience
- Move ComponentExtensionSpec validation to own test file by @markwinter in #2110
- Add default container annotation by @haoxins in #2124
- update kserve manager kind as deployment in helm chart by @Suresh-Nakkeran in #2172
- Fix kubectl version compatibility issue in post e2e test script by @yuzisun in #2175
- Add aix-explainer deploy command by @Cheng8994 in #2174
- Added a specific version for protobuf by @andyi2it in #2201
- Fix controller manager image patch in CI by @yuzisun in #2199
- Remove presubmit tests depending on optional-test-infra by @aws-kf-ci-bot in #2194
- chore: Update E2E tests to use GH Actions by @pvaneck in #2206
- Add install script for KServe and ModelMesh by @chinhuang007 in #2032
- Publish helm chart as release asset by @ddelange in #2189
- e2e test enhancement changes by @andyi2it in #2237
New Contributors
- @shrinath-suresh made their first contribution in #2035
- @ittus made their first contribution in #2074
- @Curt-Park made their first contribution in #2091
- @georgetree made their first contribution in #2122
- @wenyangchou made their first contribution in #2097
- @rmccorm4 made their first contribution in #2159
- @hehe04 made their first contribution in #2166
- @Cheng8994 made their first contribution in #2174
- @shrinandj made their first contribution in #2171
- @aws-kf-ci-bot made their first contribution in #2194
- @ddelange made their first contribution in #2189
- @luranhe made their first contribution in #2249
- @eyalcha made their first contribution in #2266
- @safoinme made their first contribution in #2295
Full Changelog: v0.8.0...v0.9.0