LMDeploy Release V0.0.7


Highlights

  • FlashAttention-2 is supported, boosting context decoding speed by approximately 45%
  • Token-ID decoding has been optimized for better efficiency
  • The GEMM tuning script is now packaged in the PyPI distribution (see the usage sketch after this list)

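Since the tuning script now ships inside the wheel, it can be run directly from the installed package instead of a source checkout. Below is a minimal sketch, assuming the entry point is exposed as the `lmdeploy.turbomind.generate_gemm_config` module and that it accepts the flags shown; the exact module path, flag names, and the `./workspace` model directory are assumptions, not confirmed by this release note.

```shell
# Install the release from PyPI
pip install lmdeploy==0.0.7

# Run the bundled GEMM tuning script against a converted TurboMind workspace.
# Module path and flags are assumptions for illustration; consult the
# repository docs for the exact invocation in this version.
python3 -m lmdeploy.turbomind.generate_gemm_config \
    --tensor_para_size 1 \
    --max_batch_size 64 \
    --model_path ./workspace
```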
What's Changed

🚀 Features

💥 Improvements

🐞 Bug fixes

📚 Documentations

Full Changelog: v0.0.6...v0.0.7
