github jundot/omlx v0.2.3.post2

one month ago

Hotfix

Bug fixes

  • Fix VLM multi-request blocking: second request now starts immediately instead of waiting for the first to finish
    • Reverted vision encoding to use _mlx_executor instead of asyncio.to_thread() to avoid Metal GPU thread contention (#80, #81)
    • Changed the prefill_batch_size default so that it no longer equals completion_batch_size, which had caused continuous batching to be disabled
  • Fix segfault when sending concurrent VLM image requests by ensuring all scheduler steps run on the MLX executor thread (#81)
  • Fix crash on server start when the mcp package is not installed
  • Fix memory limit UI showing an incorrect label when the limit is set to 0
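
The executor-related fixes above rely on a general pattern: routing all GPU-bound calls through one dedicated worker thread instead of asyncio.to_thread(), which hands work to the event loop's default thread pool, where concurrent calls may run on different threads at the same time. The following is a minimal sketch of that pattern, not omlx's actual code. It assumes _mlx_executor behaves like a single-worker ThreadPoolExecutor; the names gpu_executor, encode_image, and handle_request are hypothetical, and the "encoding" is a placeholder that only reports which thread it ran on.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the project's _mlx_executor (assumption):
# a pool with exactly one worker, so every job runs on the same thread,
# one job at a time.
gpu_executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix="gpu")


def encode_image(request_id: int) -> tuple:
    """Placeholder for GPU-bound work; returns the thread it ran on."""
    return request_id, threading.current_thread().name


async def handle_request(request_id: int) -> tuple:
    loop = asyncio.get_running_loop()
    # The event loop stays free to accept other requests while the job
    # waits for, and then runs on, the single dedicated worker thread.
    return await loop.run_in_executor(gpu_executor, encode_image, request_id)


async def main() -> list:
    # Four concurrent requests; gather returns results in request order.
    return await asyncio.gather(*(handle_request(i) for i in range(4)))


results = asyncio.run(main())
thread_names = {name for _, name in results}
print(results)
print("distinct worker threads:", len(thread_names))
```

Because the pool has a single worker, the four jobs are serialized on one thread even though the requests were accepted concurrently. Swapping run_in_executor for asyncio.to_thread() in this sketch would allow the jobs to be spread across several pool threads, which is the situation the fixes above avoid.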
