Changes
- Better handle multi-GPU setups when using Transformers with bitsandbytes (`load-in-8bit` and `load-in-4bit`); a loading sketch follows this list.
- Implement the `/v1/internal/logits` endpoint for the `exllamav3` and `exllamav3_hf` loaders (an example request also follows this list).
- Make profile picture uploading safer.
- Add `fla` to the requirements for ExLlamaV3 to support `qwen3-next` models.
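As context for the bitsandbytes change, here is a minimal sketch of what multi-GPU quantized loading looks like through Transformers. The model id and dtype choice are illustrative assumptions, not part of this release:

```python
# Minimal sketch: 4-bit loading sharded across GPUs with Transformers + bitsandbytes.
# The model id and compute dtype are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # hypothetical example model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # or load_in_8bit=True
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",                       # shard layers across all visible GPUs
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```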
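And a sketch of calling the new logits endpoint. The port and payload fields shown (`prompt`, `top_logits`) are assumptions based on the server's default OpenAI-compatible API; check the project's API docs for the exact schema:

```python
# Minimal sketch: querying /v1/internal/logits for top-token logits.
# The host/port and payload fields are assumptions; verify against the API docs.
import requests

url = "http://127.0.0.1:5000/v1/internal/logits"
payload = {
    "prompt": "The capital of France is",
    "top_logits": 10,  # how many top tokens to return (assumed field name)
}

response = requests.post(url, json=payload, timeout=60)
response.raise_for_status()
print(response.json())  # expected: a token -> logit/probability mapping
```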
Bug fixes
- Fix an issue with loading certain chat histories in Instruct mode. Thanks, @Remowylliams.
- Fix portable builds for macOS x86 missing llama.cpp binaries (#7238). Thanks, @IonoclastBrigham.
Backend updates
- Update llama.cpp to https://github.com/ggml-org/llama.cpp/tree/d00cbea63c671cd85a57adaa50abf60b3b87d86f.
- Update transformers to 4.57.
- Update exllamav3 to 0.0.7.
- Update bitsandbytes to 0.48.
Portable builds
Below you can find self-contained packages that work with GGUF models (llama.cpp) and require no installation! Just download the right version for your system, unzip, and run.
Which version to download:
- Windows/Linux:
  - NVIDIA GPU: Use `cuda12.4` for newer GPUs or `cuda11.7` for older GPUs and systems with older drivers.
  - AMD/Intel GPU: Use `vulkan` builds.
  - CPU only: Use `cpu` builds.
- Mac:
  - Apple Silicon: Use `macos-arm64`.
  - Intel CPU: Use `macos-x86_64`.
Updating a portable install:
- Download and unzip the latest version.
- Replace the `user_data` folder with the one from your existing install; all your settings and models will carry over (see the sketch below).
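For those who script their updates, a minimal sketch of that second step in Python; both install paths are placeholders, not actual directory names from the release:

```python
# Minimal sketch: carry user_data from an old portable install into a new one.
# Both paths are placeholders; point them at your actual install folders.
import shutil
from pathlib import Path

old_install = Path("text-generation-webui-old")  # existing install (placeholder)
new_install = Path("text-generation-webui-new")  # freshly unzipped build (placeholder)

# Drop the stock user_data shipped with the new build, then copy yours in.
shutil.rmtree(new_install / "user_data", ignore_errors=True)
shutil.copytree(old_install / "user_data", new_install / "user_data")
```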