Enhancements
- Added support for Hygon DCU. See issue #869.
- Improved resource calculation with KV cache quantization. See issue #1092.
- Added more quantization options for Deepseek R1 in the model catalog. See issue #1123.
- Supported the use of private Hugging Face models. See issue #1093.
- Enabled Markdown rendering in the playground UI. See issue #1125.
Bug Fixes
- Fixed incorrect version reporting when running GPUStack with Docker. See issue #1077.
- Resolved incorrect resource calculation and allocation. See issues #1126, #1136.
- Corrected the arm64 Docker image to use the proper llama-box binary. See issue #1035.
- Addressed a startup failure in GPUStack after changing the timezone in Linux. See issue #1086.
- Various UI fixes and improvements. See issues #1112, #1148, #1080.
Others
- Updated built-in backend versions:
- llama-box updated to v0.0.117
- vLLM updated to v0.7.2