github jundot/omlx v0.1.5
oMLX v0.1.5

one month ago

What's New

Prefix cache correctness and reuse

  • Add strict boundary snapshot restore handling for non-sliceable cache layers.
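  • Fix exact-hit kickoff behavior to avoid an N vs. N-1 cache-state mismatch on the first decode step (the restored cache must stop one token short of the prompt so the final token can be fed to produce the first logits).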
  • Normalize rotating snapshot state for merge-safe restore behavior.
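To illustrate the exact-hit case, here is a minimal sketch of how a restore length could be chosen. It is not the oMLX implementation: `plan_kickoff`, `block_size`, and `sliceable` are hypothetical names, and the sketch assumes snapshots for non-sliceable layers exist only at multiples of `block_size`.

```python
def plan_kickoff(prompt_len: int, cached_len: int,
                 block_size: int, sliceable: bool) -> tuple[int, int]:
    """Return (restore_len, tokens_to_feed) for a prompt with a cached prefix.

    Hypothetical helper: all names here are illustrative assumptions.
    """
    if prompt_len <= 0 or block_size <= 0:
        raise ValueError("prompt_len and block_size must be positive")
    # Always leave at least one prompt token to feed, so the first decode
    # step sees a cache holding N-1 tokens rather than N.
    usable = min(cached_len, prompt_len - 1)
    if not sliceable:
        # Assumption: a non-sliceable layer can only be restored from a
        # snapshot taken at a block boundary, so snap down to the last one.
        usable = (usable // block_size) * block_size
    return usable, prompt_len - usable


# Exact hit on an 8-token prompt with a sliceable cache:
# restore 7 tokens and feed the last one.
print(plan_kickoff(8, 8, block_size=4, sliceable=True))
# Same hit with a non-sliceable cache: fall back to the boundary at 4.
print(plan_kickoff(8, 8, block_size=4, sliceable=False))
```

The trade-off in this sketch is that a non-sliceable layer re-prefills up to one block of tokens on an exact hit instead of risking a restore from a state that is one token too far ahead.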

Walk-back truncation

  • Add walk-back truncation for partial prefix matches to recover the latest valid non-sliceable state block.
  • Extend walk-back support to both ArraysCache and RotatingKVCache.
  • Fix dropped-block ref_count handling during partial reconstruction.
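The walk-back idea can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `Block`, `has_snapshot`, and `walk_back` are hypothetical, and the sketch assumes the prefix lookup has already taken one reference on every matched block.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical cache block; field names are illustrative assumptions."""
    block_id: int
    has_snapshot: bool  # True if a restorable non-sliceable state was saved here
    ref_count: int = 0


def walk_back(matched: list[Block]) -> tuple[list[Block], list[Block]]:
    """Truncate a partial prefix match to the latest block with a snapshot.

    Returns (kept, dropped). Dropped blocks have the reference taken by the
    lookup released so they are not leaked.
    """
    keep = len(matched)
    # Walk backwards until the last kept block carries a valid snapshot.
    while keep > 0 and not matched[keep - 1].has_snapshot:
        keep -= 1
    kept, dropped = matched[:keep], matched[keep:]
    for block in dropped:
        block.ref_count = max(0, block.ref_count - 1)
    return kept, dropped


blocks = [Block(0, True, 1), Block(1, False, 1), Block(2, True, 1), Block(3, False, 1)]
kept, dropped = walk_back(blocks)
print([b.block_id for b in kept], [b.block_id for b in dropped])
```

Tokens covered by the dropped blocks would then be prefilled again, starting from the state restored at the last kept block.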

Prefill performance

  • Optimize boundary snapshot prefill chunking so cache-enabled cold prefill avoids excessive boundary splits while preserving boundary-safe captures.
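One way to keep chunk ends on snapshot boundaries without splitting at every boundary is sketched below. This is an assumption about the general technique, not a description of the actual change: `prefill_chunks`, `block_size`, and `max_chunk` are hypothetical names.

```python
def prefill_chunks(num_tokens: int, block_size: int,
                   max_chunk: int) -> list[tuple[int, int]]:
    """Split a prompt into [start, end) prefill chunks.

    Hypothetical sketch: each chunk is the largest multiple of block_size
    that fits in max_chunk, so every chunk end except possibly the last
    lands on a block boundary where a snapshot could be captured.
    """
    if block_size <= 0 or max_chunk < block_size:
        raise ValueError("need 0 < block_size <= max_chunk")
    step = (max_chunk // block_size) * block_size
    chunks = []
    start = 0
    while start < num_tokens:
        end = min(start + step, num_tokens)
        chunks.append((start, end))
        start = end
    return chunks


# 1000 tokens, 64-token blocks, at most 500 tokens per chunk.
print(prefill_chunks(1000, block_size=64, max_chunk=500))
```

In this sketch snapshots are only possible at chunk ends, so a larger `max_chunk` means fewer splits but also fewer capture points.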

Tests

  • Expand scheduler, prefix cache, and hybrid cache tests for boundary snapshot, exact-hit kickoff, and walk-back scenarios.
