github oobabooga/text-generation-webui v3.8

one month ago

Changes

  • Replace use_flash_attention_2/use_eager_attention with a unified attn_implementation in the Transformers loader
  • Ignore add_bos_token in instruct prompts, let the jinja2 template decide
  • Add a "None" option for the speculative decoding model
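For the first change in the list, here is a minimal sketch of how the two old boolean flags might map onto the single new setting. The function name `legacy_flags_to_attn_implementation` is hypothetical and is not part of text-generation-webui, and falling back to `sdpa` when neither flag is set is an assumption. The strings `flash_attention_2`, `eager`, and `sdpa` are values that Hugging Face Transformers accepts for the `attn_implementation` argument of `from_pretrained`.

```python
def legacy_flags_to_attn_implementation(use_flash_attention_2=False,
                                        use_eager_attention=False):
    """Translate the two removed boolean flags into one attn_implementation value.

    Hypothetical helper for illustration only; the webui's own migration
    logic may differ.
    """
    if use_flash_attention_2 and use_eager_attention:
        # The old flags were mutually exclusive choices of attention backend.
        raise ValueError("Only one attention implementation can be selected.")
    if use_flash_attention_2:
        return "flash_attention_2"
    if use_eager_attention:
        return "eager"
    # Assumption: use PyTorch scaled dot-product attention when nothing is set.
    return "sdpa"


if __name__ == "__main__":
    print(legacy_flags_to_attn_implementation(use_flash_attention_2=True))
```

The returned string could then be passed along as `attn_implementation=...` when loading a model with Transformers.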

Backend updates


Portable builds

Below you can find portable builds: self-contained packages that work with GGUF models (llama.cpp) and require no installation! Just download the right version for your system, unzip, and run.

Which version to download:

  • Windows/Linux:

    • NVIDIA GPU: Use cuda12.4 for newer GPUs or cuda11.7 for older GPUs and systems with older drivers.
    • AMD/Intel GPU: Use vulkan builds.
    • CPU only: Use cpu builds.
  • Mac:

    • Apple Silicon: Use macos-arm64.
    • Intel CPU: Use macos-x86_64.
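
The choice above can be sketched as a small decision function. This is only an illustration: the function `pick_build` and its parameters are hypothetical, and the returned strings are the build labels named in the list, not full file names.

```python
import platform


def pick_build(os_name, gpu=None, machine="x86_64", older_gpu_or_driver=False):
    """Return the portable build label suggested by the list above.

    os_name: e.g. "Windows", "Linux", or "Darwin" (as from platform.system()).
    gpu: "nvidia", "amd", "intel", or None for CPU only.
    machine: CPU architecture, e.g. "arm64" or "x86_64" (only used on Mac).
    older_gpu_or_driver: True for older NVIDIA GPUs or older drivers.
    """
    os_key = os_name.lower()
    if os_key in ("darwin", "macos", "mac"):
        # Apple Silicon reports an ARM architecture; Intel Macs report x86_64.
        if machine.lower() in ("arm64", "aarch64"):
            return "macos-arm64"
        return "macos-x86_64"
    if os_key in ("windows", "linux"):
        gpu_key = (gpu or "").lower()
        if gpu_key == "nvidia":
            return "cuda11.7" if older_gpu_or_driver else "cuda12.4"
        if gpu_key in ("amd", "intel"):
            return "vulkan"
        return "cpu"
    raise ValueError(f"No portable build listed for OS: {os_name}")


if __name__ == "__main__":
    # The GPU vendor is not detected here; pass it in yourself.
    print(pick_build(platform.system(), gpu=None, machine=platform.machine()))
```

GPU detection is deliberately left out, since it varies by system; the sketch only encodes the rules stated in the list.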

Updating a portable install:

  1. Download and unzip the latest version.
  2. Replace the user_data folder in the new version with the one from your existing install. All your settings and models will carry over.
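
The two update steps can also be scripted. The following is a minimal sketch, assuming both versions are already unzipped side by side; the function name `carry_over_user_data` and the example paths are hypothetical. It deletes the fresh `user_data` folder in the new version, so back up anything you need first.

```python
import shutil
from pathlib import Path


def carry_over_user_data(old_install, new_install):
    """Move user_data from the existing install into the newly unzipped one."""
    old_data = Path(old_install) / "user_data"
    new_data = Path(new_install) / "user_data"
    if not old_data.is_dir():
        raise FileNotFoundError(f"No user_data folder found in {old_install}")
    if new_data.exists():
        # Remove the empty default folder shipped with the new version.
        shutil.rmtree(new_data)
    # Move (not copy) so large model files are not duplicated on disk.
    shutil.move(str(old_data), str(new_data))
    return new_data


if __name__ == "__main__":
    import sys

    # Usage: python carry_over.py <old_install_dir> <new_install_dir>
    if len(sys.argv) == 3:
        print(carry_over_user_data(sys.argv[1], sys.argv[2]))
```

Moving rather than copying matches the note that settings and models are moved; use `shutil.copytree` instead if you want to keep the old install intact.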
