ml-explore/mlx-lm v0.28.4 on GitHub

What's Changed

version by @awni in #559
Add MiniMax-M2 by @Blaizzy in #568
Align checkpoint loading with Jamba Mini and Large by @Goekdeniz-Guelmez in #555
Fix dequant + minor refactor by @awni in #572
fix eval thinking by @awni in #578
Fixed typo in load_adapters that broke adapter loading by @jyork03 in #583
Fix AttributeError when loading custom draft models by @kernelpool in #590
Add gen options and CoT removal by @awni in #587
Add parallel_residual setting to gptneox by @spotbot2k in #586
Fixed/improved behavior of the mask_prompt feature by @jyork03 in #584
add MiniMax-M2 in supported models by @sriting in #575
Fix: Remove call to deleted method by @jyork03 in #591
Make mlx-lm more type-checker friendly by @tnadav in #573
Fix: JSON parse error handling: avoid referencing stream before init by @jyork03 in #592
Adding ring mini linear by @Goekdeniz-Guelmez in #513
Add Kimi Linear by @Blaizzy in #577
DWQ for very large models by @awni in #536
Fix Byte Decoder Lookup for Esoteric Single-Characters by @N8python in #600
Fix input_embeddings prefill bug in generate_step by @Blaizzy in #606
ACKNOWLEDGMENTS.md housekeeping by @Goekdeniz-Guelmez in #594
FIX: Add missing sentencepiece dependency for tokenizers by @Deekshith-Dade in #611
switch to github actions by @awni in #618
Fix for kimi k2 by @awni in #593
Allow providing prompt caches in batched generation by @angeloskath in #602
Fix olmo3 by @awni in #628
add support for Trinity/AfMoE model by @ivanfioravanti in #640
Ministral3 by @awni in #642
Add a prompt cache that can hold multiple prompts by @angeloskath in #625
Fix flaky losses test by @awni in #643
Fix lora fusion for non affine quantization by @awni in #647
Batching in the server by @angeloskath in #626
Add deepseek v32 by @awni in #512
version bump by @awni in #649
Fix the release action by @angeloskath in #650

New Contributors

@jyork03 made their first contribution in #583
@spotbot2k made their first contribution in #586
@sriting made their first contribution in #575
@tnadav made their first contribution in #573
@Deekshith-Dade made their first contribution in #611

Full Changelog: v0.28.3...v0.28.4