github jundot/omlx v0.1.13

latest releases: v0.3.5, v0.3.5-rc1, v0.3.5.dev1...
one month ago

Highlight: In-Memory Hot Caching

Introducing an in-memory hot cache tier for KV cache blocks. Frequently accessed blocks stay in RAM for faster access, and SSD storage is used only when the hot cache reaches its capacity limit.

Configure it via --hot-cache-max-size CLI option or the admin web UI slider under Resource Management.

What's changed

  • feat: Add in-memory hot cache with write-back mode (#58)
  • fix: Merge consecutive same-role messages to prevent 500 error (#53)
  • fix: Include forced_ct_kwargs in model list API response
  • fix: Update outdated test assertions for SchedulerConfig defaults and CORS middleware
  • ui: Improve global settings host selector and move batching to advanced
  • chore: Change default top_k from 40 to 0

Full changelog: v0.1.12...v0.1.13

Don't miss a new omlx release

NewReleases is sending notifications on new releases.