github InternLM/lmdeploy v0.1.0a2
LMDeploy Release V0.1.0a2

23 months ago

What's Changed

💥 Improvements

  • Unify prefill & decode passes by @lzhangzz in #775
  • Add cuda12.1 build check CI by @irexyc in #782
  • Auto-upload the cuda12.1 Python package to the release when a new tag is created by @irexyc in #784
  • Report the inference benchmark of models with different size by @lvhan028 in #794
  • Add chat template for Yi by @AllentDan in #779

🐞 Bug fixes

  • Fix early-exit condition in attention kernel by @lzhangzz in #788
  • Fix missing arguments when benchmarking static inference performance by @lvhan028 in #787
  • Fix extra colon in InternLMChat7B template by @C1rN09 in #796
  • Fix local kv head num by @lvhan028 in #806

📚 Documentations

🌐 Other

New Contributors

Full Changelog: v0.1.0a1...v0.1.0a2
