What's Changed
- Add `mlx_lm.perplexity` by @N8python in #397
- Benchmark script by @awni in #396
- Don't reload default model by @awni in #400
- Only apply lm_head to the last token by @awni in #406
- Fix prompt cache corruption when generation is interrupted by @dojoteef in #405
- Support mxfp4 by @awni in #385
- Version bump by @awni in #410
Full Changelog: v0.26.4...v0.27.0