What's new in 0.4.2 (2023-09-15)
These are the changes in inference v0.4.2.
New features
- FEAT: concurrent generation by @codingl2k1 in #417
- FEAT: Support GGUF by @aresnow1 in #446
- FEAT: Support OpenBuddy by @codingl2k1 in #444
Enhancements
- ENH: client support desc model by @UranusSeven in #442
- ENH: caching from self-hosted storage by @UranusSeven in #419
- ENH: Assign worker sub pool at runtime instead of pre-allocated by @ChengjieLi28 in #437
- ENH: add benchmark script by @UranusSeven in #451
Bug fixes
- BUG: Fix restful client for embedding models by @aresnow1 in #439
- BUG: cmdline double line break by @UranusSeven in #441
- BUG: no error raised on unsupported fmt by @UranusSeven in #443
- BUG: Xinference list failed if embedding models are launched by @aresnow1 in #452
Tests
- TST: skip self-hosted storage tests by @UranusSeven in #453
Documentation
- DOC: fix baichuan-2 and make naming consistent by @UranusSeven in #432
- DOC: update hot topics by @UranusSeven in #456
Others
- CI: Fix Windows CI by @codingl2k1 in #440
New Contributors
- @ChengjieLi28 made their first contribution in #437
Full Changelog: v0.4.1...v0.4.2