github huggingface/text-generation-inference v0.9.2

2 years ago

Features

  • server: harden the choice of which weights to save on disk
  • server: better errors for warmup and TP
  • server: support setting GPTQ_BITS and GPTQ_GROUPSIZE through environment variables
  • server: implement sharding for a vocab_size that is not evenly divisible across shards
  • launcher: add arg validation and drop subprocess
  • router: explicit warning if revision is not set
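
To illustrate the non-divisible vocab_size item above, here is a minimal sketch of one common way to split an embedding's rows when the vocabulary does not divide evenly: round the block size up and let the last shard take the shorter remainder. The function name shard_bounds and its parameters are hypothetical and chosen for illustration; this is not taken from the text-generation-inference source.

```python
from typing import Tuple


def shard_bounds(vocab_size: int, world_size: int, rank: int) -> Tuple[int, int]:
    """Return the half-open [start, stop) range of vocabulary rows owned by `rank`.

    Hypothetical helper: the block size is rounded up, so every shard except
    possibly the last holds the same number of rows.
    """
    if world_size <= 0 or not 0 <= rank < world_size:
        raise ValueError("rank must satisfy 0 <= rank < world_size")
    # Ceiling division: the smallest block size that covers the whole vocabulary.
    block_size = (vocab_size + world_size - 1) // world_size
    # Clamp both ends so trailing shards never run past the vocabulary.
    start = min(rank * block_size, vocab_size)
    stop = min(start + block_size, vocab_size)
    return start, stop


if __name__ == "__main__":
    # 32001 rows over 4 shards: three shards of 8001 rows and one of 7998.
    for rank in range(4):
        print(rank, shard_bounds(32001, 4, rank))
```

With this scheme a token id outside a shard's range has to be masked or mapped to a padding row on that shard before the partial results are combined.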

Fix

  • server: Fixing RW code (it's remote code, so the architecture check cannot tell which weights to keep)
  • server: Fixing T5 weight names
  • server: Adding logger import to t5_modeling.py by @akowalsk
  • server: Bug fixes for GPTQ_BITS environment variable passthrough by @ssmi153
  • server: GPTQ Env vars: catch correct type of error by @ssmi153
  • server: blacklist local files
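
The GPTQ_BITS and GPTQ_GROUPSIZE entries above concern reading quantization parameters from the environment and catching the right exception type. As a hedged sketch of that general pattern (not the project's actual code), the hypothetical helper gptq_params_from_env below distinguishes a missing variable (KeyError) from a malformed one (ValueError). Only the two variable names come from the release notes; everything else is an assumption.

```python
import os
from typing import Mapping, Optional, Tuple


def gptq_params_from_env(
    env: Mapping[str, str] = os.environ,
) -> Optional[Tuple[int, int]]:
    """Return (bits, groupsize) from the environment, or None if either is unset.

    Hypothetical helper for illustration only.
    """
    try:
        bits = int(env["GPTQ_BITS"])
        groupsize = int(env["GPTQ_GROUPSIZE"])
    except KeyError:
        # A variable is missing: report "not configured" instead of failing.
        return None
    except ValueError as err:
        # A variable is set but is not an integer: surface a clear error.
        raise ValueError("GPTQ_BITS and GPTQ_GROUPSIZE must be integers") from err
    return bits, groupsize


if __name__ == "__main__":
    print(gptq_params_from_env({"GPTQ_BITS": "4", "GPTQ_GROUPSIZE": "128"}))
```

Passing the mapping as a parameter keeps the helper easy to test without modifying the real process environment.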

New Contributors

Full Changelog: v0.9.1...v0.9.2
