github jundot/omlx v0.2.22

5 hours ago

v0.2.22 Release Notes

Bug Fixes

  • Fix GPU synchronization before batch_generator.remove() in request abort path
  • Fix prefill performance regression from unnecessary per-chunk _sync_and_clear_cache() calls (#396)
  • Fix images being stripped from Anthropic tool_result content for VLM models (#393)
  • Fix GPTQ axis mismatch — align dequantize-quantize grouping with mx.quantize
  • Fix GPTQ group_size fallback crash on non-power-of-2 output dimensions
  • Fix accuracy benchmark forcing LM engine to avoid VLM empty responses

Improvements

  • Support x-api-key header for Anthropic SDK compatibility (#379)
  • oQ: MLP asymmetry for dense models — reduce up_proj bits while protecting gate_proj/down_proj
  • oQ: GPTQ performance and stability improvements, rename enhanced suffix to e

Don't miss a new omlx release

NewReleases is sending notifications on new releases.