github jundot/omlx v0.2.3.post3

one month ago

Hotfix

Bug fixes

  • Fix a GPU race condition during concurrent VLM requests that caused TransferEncodingError and a server crash (#80)
    • Remove mx.clear_cache() from the event loop thread to prevent Metal GPU contention with _mlx_executor during concurrent VLM requests
    • Always synchronize generation_stream on request completion, regardless of the cache setting (previously skipped when the oMLX cache was disabled)
    • Call clear_pending_embeddings() on the normal completion path, for consistency with the abort path
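
The pattern behind this fix can be sketched roughly as follows. This is a minimal illustration, not oMLX's actual code: gpu_executor, generate, cleanup, and handle_request are hypothetical names, and the two worker functions are plain Python stand-ins for the real MLX calls. The sketch assumes that all GPU work, including cleanup after a request, is submitted to one single-worker executor so that the event loop thread never touches the GPU while the worker does.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical single-worker executor that owns every GPU-touching call.
gpu_executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix="gpu")

calls = []  # (operation, request id, thread name), recorded for inspection


def generate(request_id):
    # Stand-in for model generation running on the GPU worker thread.
    calls.append(("generate", request_id, threading.current_thread().name))
    return f"response-{request_id}"


def cleanup(request_id):
    # Stand-in for stream synchronization and cache clearing. It runs on
    # the same worker thread as generate(), never on the event loop thread.
    calls.append(("cleanup", request_id, threading.current_thread().name))


async def handle_request(request_id):
    loop = asyncio.get_running_loop()
    try:
        return await loop.run_in_executor(gpu_executor, generate, request_id)
    finally:
        # Cleanup always runs, and is queued on the executor rather than
        # being called inline from the coroutine.
        await loop.run_in_executor(gpu_executor, cleanup, request_id)


async def main():
    return await asyncio.gather(*(handle_request(i) for i in range(3)))


results = asyncio.run(main())
```

Because the executor has a single worker, the stand-in calls are serialized even when several requests are in flight at once, which is the property the fix appears to rely on.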
