This release adds a setting to reduce peak VRAM usage and improve performance, plus a few other fixes and enhancements.
Changes since the last RC
We've made a series of changes to make the enqueue operation non-blocking. For large batches, this improves the app's responsiveness from the time you click the Invoke button until the progress bar starts moving.
Please let us know if you encounter any new errors.
Memory Management Improvements
By default, Invoke uses PyTorch's own memory allocator to load and manage models in VRAM. CUDA also provides a memory allocator, and on many systems the CUDA allocator outperforms the PyTorch allocator, reducing peak VRAM usage. On some systems, this may improve generation speeds.
You can use the new `pytorch_cuda_alloc_conf` setting in `invokeai.yaml` to opt in to CUDA's memory allocator:

```yaml
pytorch_cuda_alloc_conf: "backend:cudaMallocAsync"
```
If you do not add this setting, Invoke will continue to use the PyTorch allocator, same as it always has.
There are other possible values for this setting, dictated by PyTorch. Refer to the new section in the Low-VRAM mode docs for more information.
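For context, this setting passes through to PyTorch's `PYTORCH_CUDA_ALLOC_CONF` mechanism, which also accepts tuning options for the native allocator. The snippet below is an illustrative sketch of that format, not a recommendation; check PyTorch's CUDA memory management documentation for the full option list and values appropriate to your GPU:

```yaml
# Illustrative example only: keep PyTorch's native allocator, but allow
# memory segments to grow in place to reduce fragmentation.
pytorch_cuda_alloc_conf: "backend:native,expandable_segments:True"
```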
Other Changes
- You may now upload WebP images to Invoke. They will be converted to PNGs for use within the application. Thanks @keturn!
- More conservative estimates for VAE VRAM usage. This aims to reduce slowdowns and OOMs during the VAE decode step.
- Fixed "single or collection" field type rendering in the Workflow Editor. This bug was preventing fields like IP Adapter's images and ControlNet's control weights from displaying a widget.
- Fixed the download button in the Workflow Library list, which was downloading the active workflow instead of the workflow for which the button was clicked.
- Enqueuing a batch (i.e. what happens when you click the Invoke button) is now a non-blocking operation, allowing the app to be more responsive immediately after clicking Invoke. To enable this improvement, we migrated from using a global mutex for DB access with long-lived SQLite cursors to WAL mode with short-lived SQLite cursors. This is expected to afford a minor (likely not noticeable) performance boost in the backend in addition to the responsiveness improvement.
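The DB migration described in the last bullet can be illustrated with a minimal standalone sketch (this is not Invoke's actual code): enable SQLite's WAL journal mode so readers don't block behind a writer, and use short-lived cursors that complete and release each statement instead of holding a long-lived cursor behind a global mutex.

```python
import os
import sqlite3
import tempfile

# WAL mode requires a file-backed database (in-memory DBs don't support it).
db_path = os.path.join(tempfile.mkdtemp(), "queue.db")
conn = sqlite3.connect(db_path)

# Switch to write-ahead logging: readers see a consistent snapshot while
# a single writer appends to the log, instead of blocking each other.
mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]

# Short-lived cursors: each statement creates, uses, and releases its own
# cursor inside a transaction scope, rather than sharing one long-lived cursor.
with conn:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS queue (id INTEGER PRIMARY KEY, item TEXT)"
    )
    conn.execute("INSERT INTO queue (item) VALUES (?)", ("batch-1",))

rows = conn.execute("SELECT item FROM queue").fetchall()
conn.close()
```

The same pattern applies regardless of which thread enqueues: because readers no longer wait on the writer, the UI request that enqueues a batch can return promptly.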
Installing and Updating
The new Invoke Launcher is the recommended way to install, update and run Invoke. It takes care of a lot of details for you, like installing the right version of Python, and runs Invoke as a desktop application.
Follow the Quick Start guide to get started with the launcher.
If you don't want to use the launcher, or need a headless install, you can follow the manual install guide.
What's Changed
- Tidy app entrypoint by @RyanJDick in #7668
- Do not cache image layers in CI docker build by @ebr in #7712
- Add `pytorch_cuda_alloc_conf` config to tune VRAM memory allocation by @RyanJDick in #7673
- Increase VAE decode memory estimates by @RyanJDick in #7674
- fix(ui): download button in workflow library downloads wrong workflow by @psychedelicious in #7715
- docs: update RELEASE.md by @psychedelicious in #7707
- fix(ui): single or collection field rendering by @psychedelicious in #7714
- feat: accept WebP uploads for assets by @keturn in #7718
- chore: bump version to v5.7.2rc1 by @psychedelicious in #7721
- feat(app): non blocking enqueue_batch by @psychedelicious in #7724
- fix(ui): add missing builder translations by @psychedelicious in #7723
- ui: translations update from weblate by @weblate in #7722
- fix(app): recursive cursor errors by @psychedelicious in #7727
- chore: bump version to v5.7.2rc2 by @psychedelicious in #7725
Full Changelog: v5.7.1...v5.7.2rc2