invoke-ai/InvokeAI v5.6.0rc2 on GitHub

This release brings major improvements to Invoke's memory management, plus a few minor fixes.

Memory Management Improvements (aka Low-VRAM mode)

The goal of these changes is to allow users with low-VRAM GPUs to run even the beefiest models, like the 24GB unquantised FLUX dev model.

Despite the focus on low-VRAM GPUs and the colloquial name "Low-VRAM mode", most users benefit from these improvements to Invoke's memory management.

Low-VRAM mode works on systems with dedicated GPUs (Nvidia GPUs on Windows/Linux and AMD GPUs on Linux). It allows you to generate even if your GPU doesn't have enough VRAM to hold full models.

Low-VRAM mode involves 3 features, each of which can be configured or fine-tuned:

Partial model loading
Dynamic RAM and VRAM cache sizes
Working memory

Most users should only need to enable partial loading by adding this line to their invokeai.yaml file:

enable_partial_loading: true

🚨 Windows users should also disable the Nvidia sysmem fallback.

For more details and instructions for fine-tuning, see the Low-VRAM mode docs.
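
As a rough sketch of how this fits into invokeai.yaml, the fragment below enables partial loading and leaves the cache limits commented out. Only enable_partial_loading is the change recommended in these notes. max_cache_ram_gb and max_cache_vram_gb are the new settings named in the changes below; the numbers are illustrative placeholders, not recommendations, and the fragment assumes the cache sizes are chosen dynamically when those settings are left unset. The working memory setting is omitted here; see the Low-VRAM mode docs for it.

```yaml
# invokeai.yaml (only the lines relevant to Low-VRAM mode are shown)

# Partial model loading: the one change most users need.
enable_partial_loading: true

# Dynamic RAM and VRAM cache sizes: assumed to be picked automatically
# when these are unset. Uncomment only to impose manual limits.
# Values are in GB and are placeholders, not recommendations.
# max_cache_ram_gb: 28
# max_cache_vram_gb: 18
```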

Thanks to @RyanJDick for designing and implementing these improvements!

Changes since previous release candidate (v5.6.0rc1)

Fix some model loading errors that occurred in edge cases.
Fix error when using DPM++ schedulers with certain models. Thanks @Vargol!
Deprecate the ram and vram settings in favor of new max_cache_ram_gb and max_cache_vram_gb settings. This eases the upgrade path for users who had manually configured ram and vram in the past.
Fix (maybe, hopefully) the app scrolling off screen when run via launcher.
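
For a config that set the deprecated names by hand, the rename might look like the fragment below. This is a minimal sketch: the values are made up, and it assumes the new settings take the same unit (GB) as the old ones. Check the Low-VRAM mode docs before carrying manual limits over, since leaving the new settings unset may serve better.

```yaml
# Before (deprecated names; values are illustrative)
# ram: 16
# vram: 8

# After (new setting names from this release)
max_cache_ram_gb: 16
max_cache_vram_gb: 8
```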

The launcher itself has also been updated to fix a handful of issues, including the launcher requiring an install every time it was started and systems with AMD GPUs using the CPU instead of the GPU.

Other Changes

Fixed issue where excessively long board names could cause performance issues.
Reworked error handling when installing models from a URL.
Fix error when using DPM++ schedulers with certain models. Thanks @Vargol!
Fix (maybe, hopefully) the app scrolling off screen when run via launcher.
Updated first run screen and OOM error toast with links to Low-VRAM mode docs.
Fixed link to Scale setting's support docs.
Tidied some unused variables. Thanks @rikublock!
Added typegen check to CI pipeline. Thanks @rikublock!
Added stereogram nodes to Community Nodes docs. Thanks @simonfuhrmann!
Updated installation-related docs (quick start, manual install, dev install).
Added Low-VRAM mode docs.

Installing and Updating

The new Invoke Launcher is the recommended way to install, update and run Invoke. It takes care of a lot of details for you - like installing the right version of Python - and runs Invoke as a desktop application.

Follow the Quick Start guide to get started with the launcher.

If you already have the launcher, you can use it to update your existing install.

We've just updated the launcher to v1.2.1 with a handful of fixes. To update the launcher itself, download the latest version from the quick start guide - the download links are kept up to date.

Legacy Scripts (not recommended!)

We recommend using the launcher, as described in the previous section!

To install or update with the outdated legacy scripts 😱, download the latest legacy scripts and follow the legacy scripts instructions.

What's Changed

Update Readme with new Installer Instructions by @hipsterusername in #7455
docs: fix installation docs home by @psychedelicious in #7470
docs: fix installation docs home again by @psychedelicious in #7471
feat(ci): add typegen check workflow by @rikublock in #7463
docs: update download links for launcher by @psychedelicious in #7489
Add Stereogram Nodes to communityNodes.md by @simonfuhrmann in #7493
Partial Loading PR1: Tidy ModelCache by @RyanJDick in #7492
Partial Loading PR2: Add utils to support partial loading of models from CPU to GPU by @RyanJDick in #7494
Partial Loading PR3: Integrate 1) partial loading, 2) quantized models, 3) model patching by @RyanJDick in #7500
Correct Scale Informational Popover by @hipsterusername in #7499
docs: install guides by @psychedelicious in #7508
docs: no need to specify version for dev env setup by @psychedelicious in #7510
feat(ui): reset canvas layers only resets the layers by @psychedelicious in #7511
refactor(ui): mm model install error handling by @psychedelicious in #7512
fix(api): limit board_name length to 300 characters by @maryhipp in #7515
fix(app): remove obsolete DEFAULT_PRECISION variable by @rikublock in #7473
Partial Loading PR 3.5: Fix pre-mature model drops from the RAM cache by @RyanJDick in #7522
Partial Loading PR4: Enable partial loading (behind config flag) by @RyanJDick in #7505
Partial Loading PR5: Dynamic cache ram/vram limits by @RyanJDick in #7509
ui: translations update from weblate by @weblate in #7480
chore: bump version to v5.6.0rc1 by @psychedelicious in #7521
Bugfix: Offload of GGML-quantized model in torch.inference_mode() cm by @RyanJDick in #7525
Deprecate ram/vram configs for smoother migration path to dynamic limits by @RyanJDick in #7526
docs: fix pypi indices for manual install for AMD by @psychedelicious in #7528
Bugfix: Do not rely on model.device if model could be partially loaded by @RyanJDick in #7529
Fix for DEIS / DPM++ config clash by setting algorithm type - fixes #6368 by @Vargol in #7440
Whats new 5.6 by @maryhipp in #7527
fix(ui): prevent canvas & main panel content from scrolling by @psychedelicious in #7532
docs,ui: low vram guide & first run blurb by @psychedelicious in #7533
docs: fix incorrect macOS launcher fix command by @psychedelicious in #7536
chore: bump version to v5.6.0rc2 by @psychedelicious in #7538

New Contributors

@simonfuhrmann made their first contribution in #7493

Full Changelog: v5.5.0...v5.6.0rc2