vllm-project/vllm v0.3.0


Major Changes

  • Experimental multi-LoRA support (see the sketch below)
  • Experimental prefix caching support (sketched below)
  • FP8 KV cache support (sketched below)
  • Optimized MoE performance and DeepSeek MoE support
  • CI-tested PRs
  • Batch completion support in the OpenAI-compatible server (sketched below)
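
A minimal sketch of the experimental multi-LoRA support, assuming the `enable_lora` engine flag and the `LoRARequest` helper shipped in this release; the model name and adapter path are placeholders.

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# enable_lora reserves capacity for serving LoRA adapters alongside
# the base model weights.
llm = LLM(model="meta-llama/Llama-2-7b-hf", enable_lora=True)

sampling_params = SamplingParams(temperature=0.0, max_tokens=64)

# A LoRARequest pairs a human-readable name, a unique integer ID, and
# a local path to the adapter weights (placeholder path below).
outputs = llm.generate(
    ["Write a SQL query listing all users."],
    sampling_params,
    lora_request=LoRARequest("sql-adapter", 1, "/path/to/sql-lora"),
)
print(outputs[0].outputs[0].text)
```

Different requests can pass different `LoRARequest`s, so one engine can serve several adapters over a shared base model.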
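Prefix caching is sketched under stronger assumptions: this release's examples pass a `prefix_pos` argument to `generate()` to mark how many leading tokens the prompts share, and that argument's exact semantics (and its survival in later versions) are assumptions here.

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "meta-llama/Llama-2-7b-hf"
llm = LLM(model=model_name)
sampling_params = SamplingParams(temperature=0.0, max_tokens=32)

# A long instruction shared by every request; caching its KV blocks
# avoids recomputing them per prompt.
prefix = "You are a helpful assistant that answers questions about cities. "
questions = ["What is the capital of France?",
             "What is the capital of Japan?"]
prompts = [prefix + q for q in questions]

# prefix_pos (assumed): number of leading tokens shared across prompts.
prefix_len = len(AutoTokenizer.from_pretrained(model_name).encode(prefix))
outputs = llm.generate(prompts, sampling_params,
                       prefix_pos=[prefix_len] * len(prompts))
for out in outputs:
    print(out.outputs[0].text)
```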
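For the FP8 KV cache, a sketch assuming the `kv_cache_dtype="fp8_e5m2"` engine argument added around this release: storing the cache in 8-bit floating point roughly halves KV memory versus fp16, trading a little accuracy for longer contexts or more concurrent sequences.

```python
from vllm import LLM, SamplingParams

# kv_cache_dtype selects the storage dtype for the KV cache only;
# model weights and activations keep their usual precision.
llm = LLM(model="meta-llama/Llama-2-7b-hf", kv_cache_dtype="fp8_e5m2")

outputs = llm.generate(
    ["The capital of France is"],
    SamplingParams(temperature=0.0, max_tokens=16),
)
print(outputs[0].outputs[0].text)
```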
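Batch completion against the OpenAI-compatible server (started with `python -m vllm.entrypoints.openai.api_server --model <model>`): a single `/v1/completions` request can now carry a list of prompts. The host, port, and model name below are placeholders.

```python
import requests

resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "model": "meta-llama/Llama-2-7b-hf",
        # A list of prompts is batched into one request.
        "prompt": ["Hello, my name is", "The capital of France is"],
        "max_tokens": 16,
        "temperature": 0.0,
    },
)
# Each prompt gets its own choice, distinguished by "index".
for choice in resp.json()["choices"]:
    print(choice["index"], choice["text"])
```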

What's Changed

New Contributors

Full Changelog: v0.2.7...v0.3.0
