What's Changed
- upgrade vllm/transformers version by @johnugeorge in #3671
- Add openai models endpoint by @cmaddalozzo in #3666
- feat: Support customizable deployment strategy for RawDeployment mode. Fixes #3452 by @terrytangyuan in #3603
- Enable dtype support for huggingface server by @Datta0 in #3613
- Add method for checking model health/readiness by @cmaddalozzo in #3673
- Fix for extracting zip from GCS by @andyi2it in #3510
- Update Dockerfile and Readme by @gavrishp in #3676
- Update huggingface readme by @alexagriffith in #3678
- fix: HPA equality check should include annotations by @terrytangyuan in #3650
- Fix: huggingface runtime in helm chart by @yuzisun in #3679
- Fix: model id and model dir check order by @yuzisun in #3680
- Fix: vLLM Model Supported check throwing circular dependency by @gavrishp in #3688
- Fix: Allow null finish reason in vLLM streaming response by @gavrishp in #3684
- Unify the log configuration using kserve logger by @sivanantha321 in #3577
- Remove conversion webhook from kubeflow manifest patch by @sivanantha321 in #3700
- Add the field ResponseStartTimeoutSeconds to create ksvc by @houshengbo in #3705
New Contributors
Full Changelog: v0.13.0-rc0...v0.13.0-rc1