oobabooga/text-generation-webui v3.19


Qwen3-Next llama.cpp support!

Changes

  • Add a slider for --ubatch-size to the llama.cpp loader and change the defaults for better MoE performance (#7316). Thanks, @GodEmperor785.
    • This significantly improves prompt processing speeds for MoE models in both full-GPU and GPU+CPU configurations.
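To make the flag's effect concrete, here is a minimal, hypothetical sketch (not webui or llama.cpp code) of what a micro-batch size means: during prompt processing, the prompt is evaluated in chunks of at most `ubatch_size` tokens, so a larger value means fewer, bigger compute passes.

```python
# Hypothetical illustration of micro-batching during prompt processing.
# split_into_ubatches is an invented helper, not part of the webui API.
def split_into_ubatches(tokens, ubatch_size):
    """Split a token list into micro-batches of at most ubatch_size tokens."""
    return [tokens[i:i + ubatch_size] for i in range(0, len(tokens), ubatch_size)]

prompt = list(range(10))          # stand-in for 10 prompt tokens
chunks = split_into_ubatches(prompt, 4)
print([len(c) for c in chunks])   # → [4, 4, 2]
```

With a larger `--ubatch-size`, the same prompt is processed in fewer chunks, which is why the new defaults help MoE prompt-processing throughput.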

Bug fixes

  • fix(deps): upgrade coqui-tts to >=0.27.0 for transformers 4.55 compatibility (#7329). Thanks, @aidevtime.

Backend updates

  • Update llama.cpp, adding Qwen3-Next support.

Portable builds

Below you can find self-contained packages that work with GGUF models (llama.cpp) and require no installation! Just download the right version for your system, unzip, and run.

Which version to download:

  • Windows/Linux:

    • NVIDIA GPU: Use cuda12.4.
    • AMD/Intel GPU: Use vulkan builds.
    • CPU only: Use cpu builds.
  • Mac:

    • Apple Silicon: Use macos-arm64.

Updating a portable install:

  1. Download and unzip the latest version.
  2. Replace the new install's user_data folder with the one from your existing install. All your settings and models will be carried over.
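The steps above can be sketched as shell commands. The folder names `old-install` and `new-install` are placeholders for your existing install and the freshly unzipped release:

```shell
# Simulated layout (hypothetical names): old-install is your current
# install, new-install is the freshly unzipped release.
mkdir -p old-install/user_data new-install
echo "settings" > old-install/user_data/settings.yaml

# Carry your settings and models over by copying user_data into the new install
cp -r old-install/user_data new-install/
```

After the copy, launch the new version and it will pick up your existing settings and models from user_data.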
