github huggingface/text-generation-inference v2.3.1

13 months ago

Important changes

  • Added support for Mllama (Llama 3.2 vision models), with flash attention and unpadded batching.
  • FP8 performance improvements
  • MoE performance improvements
  • BREAKING CHANGE - When using tools, models could previously answer with a notify_error tool call whose content was the error. They now output regular generation instead.
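
Client code that used to look for the notify_error tool call may need a fallback for plain text after this change. The following is a minimal sketch, not the project's own client code. It assumes an OpenAI-style chat-completion payload (choices[0].message with optional "tool_calls" and "content" fields), and the helper name extract_reply is hypothetical.

```python
from typing import Any


def extract_reply(response: dict[str, Any]) -> tuple[str, Any]:
    """Classify a chat-completion response as a tool call, an error, or text.

    Assumption: the payload is OpenAI-style, i.e. choices[0].message holds an
    optional "tool_calls" list and an optional "content" string.
    """
    message = response["choices"][0]["message"]
    tool_calls = message.get("tool_calls") or []
    if tool_calls:
        function = tool_calls[0]["function"]
        # Before v2.3.1 the model could answer with a notify_error tool call.
        if function.get("name") == "notify_error":
            return ("error", function.get("arguments"))
        return ("tool_call", function)
    # From v2.3.1 on, the same situation yields regular generated text.
    return ("text", message.get("content"))
```

Handling both branches keeps a client working against servers on either side of the change.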

What's Changed

New Contributors

Full Changelog: v2.3.0...v2.3.1
