What's Changed
- version by @awni in #559
- Add Minimax-M2 by @Blaizzy in #568
- Align checkpoint loading with Jamba Mini and Large by @Goekdeniz-Guelmez in #555
- Fix dequant + minor refactor by @awni in #572
- fix eval thinking by @awni in #578
- Fixed typo in load_adapters that broke adapter loading by @jyork03 in #583
- Fix AttributeError when loading custom draft models by @kernelpool in #590
- Add gen options and CoT removal by @awni in #587
- Add parallel_residual setting to gptneox by @spotbot2k in #586
- Fixed/improved behavior of the mask_prompt feature. by @jyork03 in #584
- add MiniMax-M2 in supported models by @sriting in #575
- Fix: Remove call to deleted method by @jyork03 in #591
- Make mlx-lm more type-checker friendly by @tnadav in #573
- Fix: JSON parse error handling: avoid referencing stream before init by @jyork03 in #592
- Adding ring mini linear by @Goekdeniz-Guelmez in #513
- Add Kimi Linear by @Blaizzy in #577
- DWQ for very large models by @awni in #536
- Fix Byte Decoder Lookup for Esoteric Single-Characters by @N8python in #600
- Fix input_embeddings prefill bug in generate_step by @Blaizzy in #606
- ACKNOWLEDGMENTS.md House keeping by @Goekdeniz-Guelmez in #594
- FIX: Add missing sentencepiece dependency for tokenizers by @Deekshith-Dade in #611
- switch to github actions by @awni in #618
- Fix for kimi k2 by @awni in #593
- Allow providing prompt caches in batched generation by @angeloskath in #602
- Fix olmo3 by @awni in #628
- add support for Trinity/AfMoE model by @ivanfioravanti in #640
- Ministral3 by @awni in #642
- Add a prompt cache that can hold multiple prompts by @angeloskath in #625
- Fix flaky losses test by @awni in #643
- Fix lora fusion for non affine quantization by @awni in #647
- Batching in the server by @angeloskath in #626
- Add deepseek v32 by @awni in #512
- version bump by @awni in #649
- Fix the release action by @angeloskath in #650
New Contributors
- @jyork03 made their first contribution in #583
- @spotbot2k made their first contribution in #586
- @sriting made their first contribution in #575
- @tnadav made their first contribution in #573
- @Deekshith-Dade made their first contribution in #611
Full Changelog: v0.28.3...v0.28.4