Image generation support!
Changes
- Image generation support: Generate images with diffusers models like Z-Image-Turbo in a new "Image AI" tab. Features include:
  - 4bit/8bit quantization
  - torch.compile support
  - LLM-generated prompt variations
  - PNG metadata for generation settings
  - Gallery for past generations
  - Progress bar
  - OpenAI-compatible API endpoint for image generation

  For a step-by-step tutorial, consult: Image Generation Tutorial
- Pass bos_token and eos_token to jinja2 templates, making it possible to use the template for Seed-OSS-36B-Instruct and other models
- Use flash_attention_2 by default for Transformers models
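For the OpenAI-compatible image endpoint listed above, a request might be built roughly as follows. This is only a sketch: the host, port, and route (`http://127.0.0.1:5000` and `/v1/images/generations`) are assumptions based on OpenAI's own image API shape and are not stated in these notes, and `build_image_request` is a hypothetical helper. Check the Image Generation Tutorial for the actual route and supported fields.

```python
import json
import urllib.request

# Assumption: the server listens locally on port 5000 and mirrors OpenAI's
# images route. Neither value is stated in these notes; adjust to your setup.
BASE_URL = "http://127.0.0.1:5000"
IMAGES_PATH = "/v1/images/generations"


def build_image_request(prompt, size="1024x1024", n=1):
    """Build (but do not send) a POST request in OpenAI's image-generation shape."""
    payload = {
        "prompt": prompt,
        "n": n,
        "size": size,
        "response_format": "b64_json",  # ask for base64 image data in the reply
    }
    return urllib.request.Request(
        BASE_URL + IMAGES_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_image_request("a lighthouse at dusk, watercolor")
print(request.get_method(), request.full_url)
```

To actually send it, pass the request to `urllib.request.urlopen` and JSON-decode the response body; the fields the server accepts may differ from the ones assumed here.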
Bug fixes
- Fix API requests always returning the same created time
Backend updates
- Update llama.cpp to https://github.com/ggml-org/llama.cpp/tree/0a540f9abd98915edb99fed47d80078ed8d2f343
- Update ExLlamaV3 to 0.0.17
Portable builds
Below you can find self-contained packages that work with GGUF models (llama.cpp) and require no installation! Just download the right version for your system, unzip, and run.
Which version to download:
- Windows/Linux:
  - NVIDIA GPU: Use cuda12.4.
  - AMD/Intel GPU: Use vulkan builds.
  - CPU only: Use cpu builds.
- Mac:
  - Apple Silicon: Use macos-arm64.
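The selection rules above can be written as a small lookup. This is an illustrative sketch only: `pick_portable_build` and its parameters are hypothetical names, and the returned strings are just the build labels from the list, not full file names.

```python
def pick_portable_build(os_name, gpu_vendor=None, apple_silicon=True):
    """Return the build label from the list above for an OS and GPU vendor."""
    os_name = os_name.lower()
    vendor = (gpu_vendor or "").lower()

    if os_name == "mac":
        # The list names only an Apple Silicon build for Mac.
        if apple_silicon:
            return "macos-arm64"
        raise ValueError("no portable build listed for non-Apple-Silicon Macs")

    if os_name in ("windows", "linux"):
        if vendor == "nvidia":
            return "cuda12.4"
        if vendor in ("amd", "intel"):
            return "vulkan"
        return "cpu"  # no supported GPU: fall back to the CPU-only build

    raise ValueError(f"no portable build listed for {os_name!r}")


print(pick_portable_build("linux", "nvidia"))
print(pick_portable_build("windows", "amd"))
```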
Updating a portable install:
- Download and unzip the latest version.
- Replace the user_data folder in the new version with the one from your existing install. All your settings and models will carry over.
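The update steps above could be scripted along these lines. This is a minimal sketch, not an official updater: `carry_over_user_data` is a hypothetical helper, it assumes both installs are already unzipped side by side, and it copies rather than moves so the old install stays intact until you have confirmed the new one works.

```python
import shutil
from pathlib import Path


def carry_over_user_data(old_install, new_install):
    """Replace new_install/user_data with a copy of old_install/user_data."""
    old_data = Path(old_install) / "user_data"
    new_data = Path(new_install) / "user_data"

    if not old_data.is_dir():
        raise FileNotFoundError(f"no user_data folder in {old_install}")

    # Remove the fresh defaults shipped with the new version, if present.
    if new_data.exists():
        shutil.rmtree(new_data)

    # Copy settings and models from the existing install.
    shutil.copytree(old_data, new_data)
    return new_data
```

A call such as `carry_over_user_data("old-portable-folder", "new-portable-folder")` would perform the replacement; both folder names here are placeholders for your own paths.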