github InternLM/lmdeploy v0.3.0
LMDeploy Release V0.3.0

19 months ago

Highlight

  • Refactor attention and optimize GQA (#1258, #1307, #1116), achieving 22+ RPS for internlm2-7b and 16+ RPS for internlm2-20b, about 1.8x faster than vLLM
  • Support new models, including Qwen1.5-MOE (#1372), DBRX (#1367), and DeepSeek-VL (#1335)

What's Changed

🚀 Features

💥 Improvements

🐞 Bug fixes

📚 Documentations

🌐 Other

Full Changelog: v0.2.6...v0.3.0
