## Changes
- Add the Tensor Parallelism option to the ExLlamav3/ExLlamav3_HF loaders through the `--enable-tp` and `--tp-backend` options (see the launch sketch after this list).
- Set multimodal status during model loading instead of checking on every generation (#7199). Thanks, @altoiddealer.
- Improve the multimodal API examples slightly (see the request sketch below).
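
For reference, here is a minimal launch sketch using the new flags. The model folder name is a placeholder, and the accepted values for `--tp-backend` depend on the installed ExLlamaV3 version, so `<backend>` stands in for one of them:

```sh
# Load an EXL3 model with tensor parallelism across visible GPUs.
# "my-model-exl3" is a placeholder model folder name.
python server.py --loader ExLlamav3_HF --model my-model-exl3 --enable-tp

# Optionally select a specific TP backend (replace <backend> with a
# value supported by your ExLlamaV3 build):
python server.py --loader ExLlamav3_HF --model my-model-exl3 --enable-tp --tp-backend <backend>
```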
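And a sketch of a multimodal request against the OpenAI-compatible API, assuming the server was started with `--api` and is listening on the default port 5000. The request follows the OpenAI chat-completions schema; the image URL is a placeholder, and a base64 `data:` URI can be sent the same way:

```sh
curl http://127.0.0.1:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [{
          "role": "user",
          "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.png"}}
          ]
        }]
      }'
```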
## Bug fixes
- Make web search functional again
- mtmd: Fix a bug when "include past attachments" is unchecked
- Fix code blocks having an extra empty line in the UI
## Backend updates
- Update llama.cpp to ggml-org/llama.cpp@6d7f111
- Update ExLlamaV3 to 0.0.6
- Update flash-attention to 2.8.3
## Portable builds
Below you can find self-contained packages that work with GGUF models (llama.cpp) and require no installation! Just download the right version for your system, unzip, and run.
Which version to download:
- Windows/Linux:
  - NVIDIA GPU: Use `cuda12.4` for newer GPUs or `cuda11.7` for older GPUs and systems with older drivers.
  - AMD/Intel GPU: Use `vulkan` builds.
  - CPU only: Use `cpu` builds.
- Mac:
  - Apple Silicon: Use `macos-arm64`.
  - Intel CPU: Use `macos-x86_64`.
Updating a portable install:
- Download and unzip the latest version.
- Replace the `user_data` folder with the one in your existing install. All your settings and models will be carried over.
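
For example, on Linux or macOS the update can be scripted along these lines; the archive and folder names are illustrative and should be adjusted to your actual install:

```sh
# Unzip the new release next to the existing install
# (file and folder names below are placeholders).
unzip textgen-portable-new.zip -d textgen-new

# Carry your settings and models over from the old install.
rm -rf textgen-new/user_data
cp -r textgen-old/user_data textgen-new/
```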