What's Changed
- fix dynamic quant for bias by @awni in #216
- feat: add MiniCPM4 model structure code minicpm4.py and minicpm4 mode… by @pzc163 in #212
- Implementation of AFM in MLX by @angeloskath in #232
- Support cuda back-end by @awni in #241
- Fix cast predicate by @awni in #243
- Pipe input_embeddings through mistral3 model_type by @will-lms in #254
- Fix IndexError in NaiveStreamingDetokenizer by @Muhtasham in #253
- Enable tool use in the server and add an example using openai client by @awni in #217
- Allow models to be pickled + test by @awni in #261
- Gemma3n text only support by @awni in #258
- Add Ernie4.5 by @johnmai-dev in #263
- Allow converting local models by @awni in #265
- Adding support for rednote-hilab/dots.llm1.inst by @Goekdeniz-Guelmez in #211
- Add bitnet1.58 with custom metal kernel by @Blaizzy in #219
- Adding ernie4.5 moe by @Goekdeniz-Guelmez in #267
- Add Hunyuan-A13B-Instruct MoE support by @ivanfioravanti in #273
- Parse JSON arguments when OpenAI tool calling by @crodjer in #271
- Add SmolLM3 by @Vaibhavs10 in #272
- Feat: add falcon-e support for bitnet models by @younesbelkada in #268
- Allow generation without README by @will-lms in #278
- Allow prompt and input_embeddings by @will-lms in #266
- KL loss and memory improvements for DWQ and dynamic quant by @angeloskath in #280
- remove sentencepiece by @awni in #282
- Automate PyPI upload by @awni in #283
New Contributors
- @pzc163 made their first contribution in #212
- @Muhtasham made their first contribution in #253
- @crodjer made their first contribution in #271
- @Vaibhavs10 made their first contribution in #272
- @younesbelkada made their first contribution in #268
Full Changelog: v0.25.1...v0.26.0