What's Changed
- import logging so the actual error is reported instead of a missing-logging error by @Maanas-Verma in #778
- server: use OpenAI compatible finish_reason by @percontation in #782
- move Xielu Activation in Apertus to activations.py by @Goekdeniz-Guelmez in #772
- bump transformers by @awni in #746
- Update glm4_moe_lite to store KV latent in cache by @N8python in #780
- Adding TeleChat3 by @Goekdeniz-Guelmez in #773
- add kimi tool parser by @Evanev7 in #791
- Allow qq ops with activation quantization by @awni in #749
- fix: use correct variable for logprobs in batch generation by @LuqDaMan in #800
- Sync random seed across ranks in distributed chat by @kernelpool in #801
- Fix ArraysCache.from_state not initializing left_padding and lengths by @lpalbou in #807
New Contributors
- @Maanas-Verma made their first contribution in #778
- @percontation made their first contribution in #782
- @LuqDaMan made their first contribution in #800
- @lpalbou made their first contribution in #807
Full Changelog: v0.30.4...v0.30.5