github jundot/omlx v0.1.10
oMLX v0.1.10

one month ago

Highlights

Qwen 3.5 SSD Caching Support

SSD caching now fully supports Qwen 3.5's hybrid architecture (GatedDeltaNet + Attention). The persistent cache speeds up multi-turn conversations, which makes agentic coding on your Mac practical with oMLX.
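
To illustrate why a persistent cache helps multi-turn conversations, here is a minimal sketch of content-addressed block keys. It is an assumption about how such a cache could be keyed, not a description of oMLX's actual scheme: the functions block_keys and shared_prefix_blocks and the tiny block size are hypothetical and chosen only for the demonstration.

```python
import hashlib


def block_keys(tokens, block_size=4):
    """Return one chained hash per full block of tokens.

    Hypothetical helper: each key depends on the block's tokens and on the
    previous key, so a key identifies the whole prefix up to that block.
    """
    keys = []
    parent = b""
    # Only full blocks get a key; a trailing partial block is skipped.
    for start in range(0, len(tokens) - block_size + 1, block_size):
        block = tokens[start:start + block_size]
        payload = parent + ",".join(map(str, block)).encode()
        digest = hashlib.sha256(payload).hexdigest()
        keys.append(digest)
        parent = digest.encode()
    return keys


def shared_prefix_blocks(keys_a, keys_b):
    """Count how many leading block keys two prompts have in common."""
    count = 0
    for a, b in zip(keys_a, keys_b):
        if a != b:
            break
        count += 1
    return count


# Turn 2 of a conversation extends turn 1, so its leading blocks match.
turn1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
turn2 = turn1 + [10, 11, 12, 13]
reused = shared_prefix_blocks(block_keys(turn1), block_keys(turn2))
```

Because the keys in this sketch depend only on token content, they stay the same in a new process. That is the property a cache needs if it is to be reused from disk after a restart.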

What's New

Features

  • Per-block boundary snapshots for hybrid cache models (ArraysCache + KVCache)
  • Auto-enlarge block size (256 → 1024) for ArraysCache models to optimize cache performance and reusability
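
The effect of the larger block size can be sketched with simple arithmetic. The example assumes that a hybrid model needs one state snapshot at each full block boundary and that only full blocks are reusable. Both are assumptions made for illustration, and the helper names are hypothetical.

```python
def block_boundaries(num_tokens: int, block_size: int) -> list[int]:
    """Token offsets at which a full block ends (hypothetical helper)."""
    return list(range(block_size, num_tokens + 1, block_size))


def reusable_prefix(num_tokens: int, block_size: int) -> int:
    """Number of leading tokens covered by full blocks."""
    return (num_tokens // block_size) * block_size


prompt_len = 3000
small = block_boundaries(prompt_len, 256)    # 11 boundaries
large = block_boundaries(prompt_len, 1024)   # 2 boundaries
```

Under these assumptions, a 3000-token prompt needs 11 snapshots at block size 256 but only 2 at block size 1024. The trade-off is a coarser reusable prefix: 2048 tokens instead of 2816.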

Bug Fixes

  • Fix SSD cache not being reused across server restarts
  • Fix ArraysCache cache store/restore producing invalid placeholder states in intermediate blocks
  • Fix content array not converted to string in assistant+tool_calls path (#42)
  • Fix version string that did not comply with PEP 440, which made pip install -e . fail
  • Fix boundary snapshot OOM during long prefills by offloading to SSD (#48)
  • Fix _BoundarySnapshotProvider missing __len__ preventing paged cache storage
  • Fix shared SchedulerConfig mutation across models causing incorrect block sizes
  • Fix noisy NoneType debug log spam for non-hybrid KVCache models
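
The shared SchedulerConfig mutation fix above reflects a common Python pitfall: several models hold a reference to the same config object, and one model's adjustment leaks into the others. A minimal sketch of one way to avoid it follows. The SchedulerConfig here is a hypothetical one-field stand-in, and config_for_model is an illustrative helper, not oMLX's API.

```python
from dataclasses import dataclass, replace


@dataclass
class SchedulerConfig:
    """Hypothetical stand-in; the real class may have other fields."""
    block_size: int = 256


def config_for_model(base: SchedulerConfig, uses_arrays_cache: bool) -> SchedulerConfig:
    """Return a per-model copy instead of mutating the shared base config."""
    if uses_arrays_cache:
        # Enlarge the block size on the copy only; base stays untouched.
        return replace(base, block_size=max(base.block_size, 1024))
    return replace(base)


base = SchedulerConfig()
hybrid = config_for_model(base, uses_arrays_cache=True)
plain = config_for_model(base, uses_arrays_cache=False)
```

Because dataclasses.replace returns a new object, the hybrid model's enlarged block size does not affect the plain KVCache model or the shared default.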

Other

  • Add brew services support (#43)
  • Add Homebrew upgrade instructions to README

Note

v0.1.9 has been removed due to a critical memory issue (#48) affecting hybrid cache models during long prefills. v0.1.10 includes all v0.1.9 features plus the fixes above.
