github huggingface/text-generation-inference v0.8.0

2 years ago

Features

  • router: support vectorized warpers in flash causal lm (co-authored by @jlamypoirier)
  • proto: decrease IPC proto size
  • benchmarker: add summary tables
  • server: support RefinedWeb models

Fix

  • server: fix issue when loading an AutoModelForSeq2SeqLM model (contributed by @CL-Shang)

New Contributors

Full Changelog: v0.7.0...v0.8.0
