github InternLM/lmdeploy v0.5.2
LMDeploy Release V0.5.2

16 months ago

Highlight

  • LMDeploy supports Llama 3.1 and its tool calling. An example of calling "Wolfram Alpha" to perform complex mathematical calculations can be found here
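As a rough illustration of what Llama 3.1 tool calling looks like, the sketch below builds an OpenAI-style `tools` definition for a Wolfram Alpha-like function and the request body one might POST to a running `lmdeploy serve api_server` instance. The function name, parameter schema, and model id are illustrative assumptions, not taken from the linked example.

```python
import json

# Hypothetical tool schema for a "wolfram_alpha" function, written in the
# OpenAI-compatible format used for tool calling. The name and parameters
# here are illustrative, not LMDeploy's official example.
tools = [
    {
        "type": "function",
        "function": {
            "name": "wolfram_alpha",
            "description": "Evaluate a mathematical expression with Wolfram Alpha.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The expression or question to evaluate.",
                    }
                },
                "required": ["query"],
            },
        },
    }
]

# Body one would POST to /v1/chat/completions on a running api_server;
# the model id is an assumption for illustration.
request_body = {
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "messages": [
        {"role": "user", "content": "What is the integral of x^2 from 0 to 3?"}
    ],
    "tools": tools,
}
print(json.dumps(request_body, indent=2)[:80])
```

When the model decides to call the tool, the response carries a `tool_calls` entry whose arguments the client executes before sending the result back in a follow-up message.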

What's Changed

🚀 Features

💥 Improvements

  • Remove the triton inference server backend "turbomind_backend" by @lvhan028 in #1986
  • Remove kv cache offline quantization by @AllentDan in #2097
  • Remove session_len and deprecated short names of the chat templates by @lvhan028 in #2105
  • Clarify that "n>1" in GenerationConfig is not supported yet by @lvhan028 in #2108
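The last bullet can be mirrored in a minimal sketch: a stand-in config class (not LMDeploy's actual `GenerationConfig` implementation) that accepts an OpenAI-style `n` field but rejects any value other than 1, matching the clarified behavior.

```python
from dataclasses import dataclass


@dataclass
class GenerationConfig:
    """Minimal stand-in for illustration, not LMDeploy's actual class."""

    n: int = 1               # number of completions to return (OpenAI-style)
    max_new_tokens: int = 512

    def __post_init__(self):
        # Mirrors the documented restriction: only n == 1 is supported so far.
        if self.n != 1:
            raise ValueError("`n` > 1 in GenerationConfig is not supported yet")


cfg = GenerationConfig()      # accepted: defaults to n=1
try:
    GenerationConfig(n=4)     # rejected with a clear error
except ValueError as err:
    print(err)
```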

🐞 Bug fixes

🌐 Other

Full Changelog: v0.5.1...v0.5.2
