✨ Changes
- Add ExLlamaV3 support (#6832). This is done through a new ExLlamav3_HF loader that uses the same samplers as Transformers and ExLlamav2_HF. Wheels compiled with GitHub Actions are included for both Linux and Windows, eliminating manual installation steps. Note: these wheels require compute capability of 8 or greater, at least for now.
  - ExLlamaV3 repository: https://github.com/turboderp-org/exllamav3
  - Models: https://huggingface.co/turboderp
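As a sketch, launching the UI with the new loader could look like the following (the model directory name is a placeholder; `--model` and `--loader` are existing server options):

```shell
# Start the web UI with an EXL3 quant loaded through the new ExLlamav3_HF loader.
# "turboderp_Llama-3-8B-exl3" is a hypothetical example model folder under models/.
python server.py --model turboderp_Llama-3-8B-exl3 --loader ExLlamav3_HF
```

The loader can also be selected from the Model tab in the UI as usual.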
- Add a new chat style: Dark (#6817).
- Set context lengths to at most 8192 by default to prevent OOM errors, and show the model's maximum length in the UI (#6835).
🔧 Bug fixes
- Fix a matplotlib bug in the Google Colab notebook.
- Fix links in the ngrok extension README (#6826). Thanks @KPCOFGS.
🔄 Backend updates
- Transformers: Bump to 4.50.
- CUDA: Bump to 12.4.
- PyTorch: Bump to 2.6.0.
- FlashAttention: Bump to v2.7.4.post1.
- PEFT: Bump to 0.15. This should make Axolotl LoRAs compatible with the project.