This release adds a setting to reduce peak VRAM usage and improve performance, plus a few other fixes and enhancements.
Changes since the last RC
We've made a series of changes to make the enqueue operation non-blocking. For large batches, this improves the app's responsiveness from the time you click the Invoke button until the progress bar starts moving.
Please let us know if you encounter any new errors.
Memory Management Improvements
By default, Invoke uses PyTorch's own memory allocator to load and manage models in VRAM. CUDA also provides a memory allocator, and on many systems the CUDA allocator outperforms the PyTorch allocator, reducing peak VRAM usage. On some systems, this may improve generation speeds.
You can use the new `pytorch_cuda_alloc_conf` setting in `invokeai.yaml` to opt in to CUDA's memory allocator:

```yaml
pytorch_cuda_alloc_conf: "backend:cudaMallocAsync"
```
If you do not add this setting, Invoke will continue to use the PyTorch allocator, same as it always has.
There are other possible values for this setting, dictated by PyTorch. Refer to the new section in the Low-VRAM mode docs for more information.
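For context, this setting passes through to PyTorch's `PYTORCH_CUDA_ALLOC_CONF` mechanism, which also accepts tuning options for the native allocator. The snippet below is an illustrative sketch of that format, not a recommendation; check PyTorch's CUDA memory management documentation for the full option list and values appropriate to your GPU:

```yaml
# Illustrative example only: keep PyTorch's native allocator, but allow
# memory segments to grow in place to reduce fragmentation.
pytorch_cuda_alloc_conf: "backend:native,expandable_segments:True"
```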
Other Changes
- You may now upload WebP images to Invoke. They will be converted to PNGs for use within the application. Thanks @keturn!
- More conservative estimates for VAE VRAM usage. This aims to reduce slowdowns and OOMs during the VAE decode step.
- Fixed "single or collection" field type rendering in the Workflow Editor. This bug was preventing fields like IP Adapter's images and ControlNet's control weights from displaying a widget.
- Fixed the download button in the Workflow Library list, which was downloading the active workflow instead of the workflow for which the button was clicked.
- Enqueuing a batch (i.e. what happens when you click the Invoke button) is now a non-blocking operation, allowing the app to be more responsive immediately after clicking Invoke. To enable this improvement, we migrated from using a global mutex for DB access with long-lived SQLite cursors to WAL mode with short-lived SQLite cursors. This is expected to afford a minor (likely not noticeable) performance boost in the backend in addition to the responsiveness improvement.
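The DB migration described in the last bullet can be illustrated with a minimal standalone sketch (this is not Invoke's actual code): enable SQLite's WAL journal mode so readers don't block behind a writer, and use short-lived cursors that complete and release each statement instead of holding a long-lived cursor behind a global mutex.

```python
import os
import sqlite3
import tempfile

# WAL mode requires a file-backed database (in-memory DBs don't support it).
db_path = os.path.join(tempfile.mkdtemp(), "queue.db")
conn = sqlite3.connect(db_path)

# Switch to write-ahead logging: readers see a consistent snapshot while
# a single writer appends to the log, instead of blocking each other.
mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]

# Short-lived cursors: each statement creates, uses, and releases its own
# cursor inside a transaction scope, rather than sharing one long-lived cursor.
with conn:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS queue (id INTEGER PRIMARY KEY, item TEXT)"
    )
    conn.execute("INSERT INTO queue (item) VALUES (?)", ("batch-1",))

rows = conn.execute("SELECT item FROM queue").fetchall()
conn.close()
```

The same pattern applies regardless of which thread enqueues: because readers no longer wait on the writer, the UI request that enqueues a batch can return promptly.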
Installing and Updating
The new Invoke Launcher is the recommended way to install, update and run Invoke. It takes care of a lot of details for you, like installing the right version of Python, and runs Invoke as a desktop application.
Follow the Quick Start guide to get started with the launcher.
If you don't want to use the launcher, or need a headless install, you can follow the manual install guide.
What's Changed
- Tidy app entrypoint by @RyanJDick in #7668
- Do not cache image layers in CI docker build by @ebr in #7712
- Add `pytorch_cuda_alloc_conf` config to tune VRAM memory allocation by @RyanJDick in #7673
- Increase VAE decode memory estimates by @RyanJDick in #7674
- fix(ui): download button in workflow library downloads wrong workflow by @psychedelicious in #7715
- docs: update RELEASE.md by @psychedelicious in #7707
- fix(ui): single or collection field rendering by @psychedelicious in #7714
- feat: accept WebP uploads for assets by @keturn in #7718
- chore: bump version to v5.7.2rc1 by @psychedelicious in #7721
- feat(app): non blocking enqueue_batch by @psychedelicious in #7724
- fix(ui): add missing builder translations by @psychedelicious in #7723
- ui: translations update from weblate by @weblate in #7722
- fix(app): recursive cursor errors by @psychedelicious in #7727
- chore: bump version to v5.7.2rc2 by @psychedelicious in #7725
Full Changelog: v5.7.1...v5.7.2rc2