Changes
Bug fixes
- Fixed Python requirements for Apple devices running macOS Tahoe (#7273). Thanks, @drieschel.
Backend updates
- Update llama.cpp to https://github.com/ggml-org/llama.cpp/tree/d0660f237a5c31771a3d6d1030ebe3e0c409ba92 (adds Ling-mini-2.0, Ring-mini-2.0 support)
- Update exllamav3 to 0.0.11
- Update triton-windows to 3.5.0.post21
Portable builds
Below you can find self-contained packages that work with GGUF models (llama.cpp) and require no installation! Just download the right version for your system, unzip, and run.
Which version to download:
- Windows/Linux:
  - NVIDIA GPU: Use cuda12.4 for newer GPUs or cuda11.7 for older GPUs and systems with older drivers.
  - AMD/Intel GPU: Use vulkan builds.
  - CPU only: Use cpu builds.
- Mac:
  - Apple Silicon: Use macos-arm64.
  - Intel CPU: Use macos-x86_64.
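The selection rules above can be sketched as a small Python function. This is only an illustration, not part of the project: the function name pick_build and its parameters are hypothetical, and it returns the build label from the list (cuda12.4, cuda11.7, vulkan, cpu, macos-arm64, macos-x86_64), not an actual download file name.

```python
from typing import Optional


def pick_build(system: str, machine: str,
               gpu_vendor: Optional[str] = None,
               older_gpu_or_driver: bool = False) -> str:
    """Map a machine description to one of the build labels listed above.

    Hypothetical helper for illustration only.
    system  : e.g. the value of platform.system() ("Windows", "Linux", "Darwin")
    machine : e.g. the value of platform.machine() ("arm64", "x86_64", "AMD64")
    """
    system = system.lower()
    machine = machine.lower()

    if system == "darwin":
        # Apple Silicon reports an ARM architecture; otherwise assume an Intel Mac.
        return "macos-arm64" if machine in ("arm64", "aarch64") else "macos-x86_64"

    if system in ("windows", "linux"):
        vendor = (gpu_vendor or "").lower()
        if vendor == "nvidia":
            # cuda11.7 is for older GPUs and systems with older drivers.
            return "cuda11.7" if older_gpu_or_driver else "cuda12.4"
        if vendor in ("amd", "intel"):
            return "vulkan"
        # No supported GPU given: fall back to the CPU-only build.
        return "cpu"

    raise ValueError(f"No portable build is listed for system {system!r}")
```

For example, calling pick_build(platform.system(), platform.machine(), gpu_vendor="nvidia") on the current machine returns the suggested label. The GPU vendor is passed in by hand because the sketch does not attempt hardware detection.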
Updating a portable install:
- Download and unzip the latest version.
- Replace the user_data folder with the one in your existing install. All your settings and models will be moved.
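If you prefer to script the update, the two steps above might look like the following Python sketch. It assumes both installs are already unzipped and that user_data sits at the top level of each; the function name carry_over_user_data and its arguments are hypothetical, not something shipped with the portable builds.

```python
import shutil
from pathlib import Path


def carry_over_user_data(old_install: Path, new_install: Path) -> Path:
    """Replace new_install/user_data with old_install/user_data.

    Hypothetical helper for illustration only. The folder is moved, not
    copied, so the old install no longer has a user_data folder afterwards.
    """
    old_data = Path(old_install) / "user_data"
    new_data = Path(new_install) / "user_data"

    if not old_data.is_dir():
        raise FileNotFoundError(f"No user_data folder found in {old_install}")

    # Remove the fresh user_data folder that came with the new download.
    if new_data.exists():
        shutil.rmtree(new_data)

    # Move the existing settings and models into the new install.
    shutil.move(str(old_data), str(new_data))
    return new_data
```

Moving avoids duplicating large model files on disk. If you would rather keep the old install intact as a backup, shutil.copytree could be used in place of shutil.move.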