What's Changed
- feat(server): add --model and --adapter-path flags for startup preloading by @auggie246 in #811
- fix(qwen3_5): guard batch dimension mismatches under continuous batching by @kol22 in #813
- Add common sampling parameters to server endpoints by @spicyneuron in #814
- Add an 'index' for each tool call as per OpenAI spec by @viktike in #818
- Fix Qwen3-Omni (qwen3_omni_moe) integration by @howeirdo in #820
- Add mistral4 by @Blaizzy in #827
- Clean up server + generate args, validation, and defaults by @spicyneuron in #829
- Set custom processor as default and remove torch as dep by @Blaizzy in #821
- Add molmo point by @Blaizzy in #844
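
For readers wondering what the 'index' change in #818 refers to: in OpenAI's chat completions format, each streamed tool call entry carries an integer 'index' so clients can tell parallel tool calls apart and stitch their argument fragments back together. The following is only a minimal sketch of that idea, not the code from the PR; the helper name add_tool_call_indices and the sample ids and function names are hypothetical.

```python
from typing import Any


def add_tool_call_indices(tool_calls: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of the tool calls, each tagged with its list position.

    Hypothetical helper for illustration; any existing "index" key is overwritten.
    """
    return [{**call, "index": i} for i, call in enumerate(tool_calls)]


# Sample tool calls shaped like OpenAI-style entries (ids and names are made up).
calls = [
    {
        "id": "call_a",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    },
    {
        "id": "call_b",
        "type": "function",
        "function": {"name": "get_time", "arguments": '{"tz": "UTC"}'},
    },
]

indexed = add_tool_call_indices(calls)
for call in indexed:
    print(call["index"], call["id"], call["function"]["name"])
```

The helper returns new dictionaries rather than mutating its input, so the original tool call list stays unchanged.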
New Contributors
- @auggie246 made their first contribution in #811
- @howeirdo made their first contribution in #820
Full Changelog: v0.4.0...v0.4.1