jundot/omlx v0.2.22 on GitHub

v0.2.22 Release Notes

Fix GPU synchronization before batch_generator.remove() in request abort path
Fix prefill performance regression from unnecessary per-chunk _sync_and_clear_cache() calls (#396)
Fix images being stripped from Anthropic tool_result content for VLM models (#393)
Fix GPTQ axis mismatch — align dequantize-quantize grouping with mx.quantize
Fix GPTQ group_size fallback crash on non-power-of-2 output dimensions
Fix accuracy benchmark forcing LM engine to avoid VLM empty responses

Support x-api-key header for Anthropic SDK compatibility (#379)
oQ: MLP asymmetry for dense models — reduce up_proj bits while protecting gate_proj/down_proj
oQ: GPTQ performance and stability improvements, rename enhanced suffix to e