jundot/omlx v0.1.14.post3 on GitHub

v0.1.14.post3 (Hotfix)

Fix model memory not being freed after engine unload (#62)
- When a model was evicted (via LRU, TTL expiration, or memory enforcer), the engine and scheduler retained direct Python references to the model weights, preventing garbage collection from reclaiming GPU memory
- As a result, mx.get_active_memory() kept reporting stale, high values after eviction, which blocked subsequent model loads with "projected memory would exceed limit" errors
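
The failure mode described above can be illustrated with a minimal, standard-library-only sketch. It is not the omlx implementation: `FakeWeights`, `FakeEngine`, and `weights_freed_after_unload` are hypothetical names invented for this example, and a plain Python object stands in for the model weights. The sketch only shows that an object cannot be collected while an engine-like owner still holds a strong reference to it, and that clearing the reference on unload lets it go.

```python
import gc
import weakref


class FakeWeights:
    """Hypothetical stand-in for a large block of model weights."""

    def __init__(self, size_bytes: int) -> None:
        self.size_bytes = size_bytes


class FakeEngine:
    """Hypothetical engine that keeps a direct reference to the weights."""

    def __init__(self, weights: FakeWeights) -> None:
        self.model = weights

    def unload(self, drop_reference: bool) -> None:
        # Assumption for illustration: a correct unload clears the
        # reference; a buggy unload leaves it in place.
        if drop_reference:
            self.model = None


def weights_freed_after_unload(drop_reference: bool) -> bool:
    """Return True if the weights were collected after unload."""
    weights = FakeWeights(size_bytes=1 << 30)
    probe = weakref.ref(weights)  # observes the object without owning it
    engine = FakeEngine(weights)
    del weights  # the engine now holds the only strong reference
    engine.unload(drop_reference=drop_reference)
    gc.collect()
    # The engine is still alive here, so the result depends only on
    # whether unload() released its reference.
    return probe() is None


leaky = weights_freed_after_unload(drop_reference=False)
fixed = weights_freed_after_unload(drop_reference=True)
print(leaky, fixed)  # prints "False True"
```

In the leaky case the weak reference still resolves after `unload()`, which mirrors how retained references in the engine and scheduler would keep the weights alive and the reported active memory high.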