github huggingface/diffusers v0.37.0
Diffusers 0.37.0: Modular Diffusers, New image and video pipelines, multiple core library improvements, and more 🔥


Modular Diffusers

Modular Diffusers introduces a new way to build diffusion pipelines by composing reusable blocks. Instead of writing entire pipelines from scratch, you can now mix and match building blocks to create custom workflows tailored to your specific needs! This complements the existing DiffusionPipeline class, providing a more flexible way to create custom diffusion pipelines.
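
The block-composition idea can be illustrated with a toy sketch in plain Python. The class names below are hypothetical stand-ins for illustration only, not the actual Modular Diffusers API (see the linked docs for the real interfaces):

```python
# Toy illustration of composing a pipeline from reusable blocks.
# Plain-Python sketch of the concept only, not the Modular Diffusers API.

class Block:
    """One pipeline step: takes a state dict, returns an updated state dict."""
    def __call__(self, state):
        raise NotImplementedError

class EncodePrompt(Block):
    def __call__(self, state):
        state["embedding"] = f"emb({state['prompt']})"
        return state

class Denoise(Block):
    def __call__(self, state):
        state["latents"] = f"denoised({state['embedding']})"
        return state

class DecodeLatents(Block):
    def __call__(self, state):
        state["image"] = f"img({state['latents']})"
        return state

class SequentialBlocks:
    """Chains blocks; swapping one block customizes that stage of the workflow."""
    def __init__(self, blocks):
        self.blocks = blocks

    def __call__(self, **inputs):
        state = dict(inputs)
        for block in self.blocks:
            state = block(state)
        return state

pipeline = SequentialBlocks([EncodePrompt(), Denoise(), DecodeLatents()])
result = pipeline(prompt="a cat")
print(result["image"])  # → img(denoised(emb(a cat)))
```

Replacing, say, `Denoise` with a different implementation changes only that stage, which is the flexibility the block-based design is after.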

Find more details on how to get started with Modular Diffusers here, and also check out the announcement post.

New Pipelines and Models

Image 🌆

  • Z Image Omni Base: Z-Image is the foundation model of the Z-Image family, engineered for high quality, robust generative diversity, broad stylistic coverage, and precise prompt adherence. While Z-Image-Turbo is built for speed, Z-Image is a full-capacity, undistilled transformer designed to be the backbone for creators, researchers, and developers who require the highest level of creative freedom. Thanks to @RuoyiDu for contributing this in #12857.
  • Flux2 Klein: FLUX.2 [Klein] unifies generation and editing in a single compact architecture, delivering state-of-the-art quality with end-to-end inference in under a second. Built for applications that require real-time image generation without sacrificing quality, it runs on consumer hardware with as little as 13GB of VRAM.
  • Qwen Image Layered: Qwen-Image-Layered is a model capable of decomposing an image into multiple RGBA layers. This layered representation unlocks inherent editability: each layer can be independently manipulated without affecting other content. Thanks to @naykun for contributing this in #12853.
  • FIBO Edit: Fibo Edit is an 8B parameter image-to-image model that introduces a new paradigm of structured control, operating on JSON inputs paired with source images to enable deterministic and repeatable editing workflows. Featuring native masking for granular precision, it moves beyond simple prompt-based diffusion to offer explicit, interpretable control optimized for production environments. Its lightweight architecture is designed for deep customization, empowering researchers to build specialized “Edit” models for domain-specific tasks while delivering top-tier aesthetic quality. Thanks to @galbria for contributing it in #12930.
  • Cosmos Predict2.5: Cosmos-Predict2.5 is the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the future state of the world. Thanks to @miguelmartin75 for contributing it in #12852.
  • Cosmos Transfer2.5: Cosmos-Transfer2.5 is a conditional world generation model with adaptive multimodal control that produces high-quality world simulations conditioned on multiple control inputs. These inputs span different modalities, including edges, blurred video, segmentation maps, and depth maps. Thanks to @miguelmartin75 for contributing it in #13066.
  • GLM-Image: GLM-Image is an image generation model that adopts a hybrid autoregressive + diffusion decoder architecture, effectively pushing the upper bound of visual fidelity and fine-grained detail. Its general image generation quality is on par with industry-standard LDM-based approaches, while it demonstrates significant advantages in knowledge-intensive image generation scenarios. Thanks to @zRzRzRzRzRzRzR for contributing it in #12973.
  • RAE: Representation Autoencoders (RAEs) are an exciting alternative to traditional VAEs, which are typically used in latent-space diffusion models for image generation. RAEs leverage pre-trained vision encoders and train lightweight decoders for the task of reconstruction.
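
The "frozen pretrained encoder, lightweight trained decoder" split behind RAEs can be sketched with a minimal plain-Python toy (no relation to the actual diffusers implementation; the encoder here is a hypothetical fixed projection):

```python
# Toy illustration of the RAE idea: keep a "pretrained" encoder frozen and
# fit only a small decoder to reconstruct the input from the latent.

def encoder(x):
    # Frozen "pretrained" encoder: projects a 2-D point onto a fixed direction.
    return 0.6 * x[0] + 0.8 * x[1]

def decode(z, w):
    # Lightweight decoder with the only learnable weights, w = [w0, w1].
    return (w[0] * z, w[1] * z)

def train_decoder(data, lr=0.1, steps=200):
    # Minimize squared reconstruction error by gradient descent on the
    # decoder weights alone; the encoder is never updated.
    w = [0.0, 0.0]
    for _ in range(steps):
        for x in data:
            z = encoder(x)
            x_hat = decode(z, w)
            w[0] -= lr * 2 * (x_hat[0] - x[0]) * z
            w[1] -= lr * 2 * (x_hat[1] - x[1]) * z
    return w

# Points lying along the encoder's direction, so exact reconstruction exists.
data = [(0.6, 0.8), (1.2, 1.6), (-0.6, -0.8)]
w = train_decoder(data)
x_hat = decode(encoder((0.6, 0.8)), w)
print(round(x_hat[0], 2), round(x_hat[1], 2))  # → 0.6 0.8
```

In the real setting the frozen encoder is a large pretrained vision model and the decoder is a neural network, but the division of labor is the same: only the decoder is trained for reconstruction.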

Video + audio 🎥 🎼

  • LTX-2: LTX-2 is an audio-conditioned text-to-video generation model that can generate videos with synced audio. Full and distilled model inference is supported, as well as two-stage inference with spatial sampling. We also support a conditioning pipeline that allows passing different conditions (such as images or series of images). Check out the docs to learn more!
  • Helios: Helios is a 14B video generation model that runs at 17 FPS on a single NVIDIA H100 GPU and supports minute-scale generation while matching a strong baseline in quality. Thanks to @SHYuanBest for contributing this in #13208.

Improvements to Core Library

New caching methods

New context-parallelism (CP) backends

Misc

  • Mambo-G Guidance: New guider implementation (#12862)
  • Laplace Scheduler for DDPM (#11320)
  • Custom Sigmas in UniPCMultistepScheduler (#12109)
  • MultiControlNet support for SD3 Inpainting (#11251)
  • Context parallel in native flash attention (#12829)
  • NPU Ulysses Attention Support (#12919)
  • Fix Wan 2.1 I2V Context Parallel Inference (#12909)
  • Fix Qwen-Image Context Parallel Inference (#12970)
  • Introduction of the @apply_lora_scale decorator for simplifying model definitions (#12994)
  • Introduction of pipeline-level “cpu” device_map (#12811)
  • Enable CP for kernels-based attention backends (#12812)
  • Diffusers is fully functional with Transformers V5 (#12976)

Many of the above features and improvements came out of the MVP program we have been running. Immense thanks to the contributors!

Bug Fixes

  • Fix QwenImageEditPlus on NPU (#13017)
  • Fix MT5Tokenizer → use T5Tokenizer for Transformers v5.0+ compatibility (#12877)
  • Fix Wan/WanI2V patchification (#13038)
  • Fix LTX-2 inference with num_videos_per_prompt > 1 and CFG (#13121)
  • Fix Flux2 img2img prediction (#12855)
  • Fix QwenImage txt_seq_lens handling (#12702)
  • Fix prefix_token_len bug (#12845)
  • Fix ftfy imports in Wan and SkyReels-V2 (#12314, #13113)
  • Fix is_fsdp determination (#12960)
  • Fix GLM-Image get_image_features API (#13052)
  • Fix Wan 2.2 when either transformer isn't present (#13055)
  • Fix guider issue (#13147)
  • Fix torchao quantizer for new versions (#12901)
  • Fix GGUF for unquantized types with unquantize kernels (#12498)
  • Make Qwen hidden states contiguous for torchao (#13081)
  • Make Flux hidden states contiguous (#13068)
  • Fix Kandinsky 5 hardcoded CUDA autocast (#12814)
  • Fix aiter availability check (#13059)
  • Fix attention mask check for unsupported backends (#12892)
  • Allow prompt and prior_token_ids simultaneously in GlmImagePipeline (#13092)
  • GLM-Image batch support (#13007)
  • Cosmos 2.5 Video2World frame extraction fix (#13018)
  • ResNet: only use contiguous in training mode (#12977)

All commits

  • [PRX] Improve model compilation by @WaterKnight1998 in #12787
  • Improve docstrings and type hints in scheduling_dpmsolver_singlestep.py by @delmalih in #12798
  • [Modular]z-image by @yiyixuxu in #12808
  • Fix Qwen Edit Plus modular for multi-image input by @sayakpaul in #12601
  • [WIP] Add Flux2 modular by @DN6 in #12763
  • [docs] improve distributed inference cp docs. by @sayakpaul in #12810
  • post release 0.36.0 by @sayakpaul in #12804
  • Update distributed_inference.md to correct syntax by @sayakpaul in #12827
  • [lora] Remove lora docs unneeded and add " # Copied from ..." by @sayakpaul in #12824
  • support CP in native flash attention by @sywangyi in #12829
  • [qwen-image] edit 2511 support by @naykun in #12839
  • fix pytest tests/pipelines/pixart_sigma/test_pixart.py::PixArtSigmaPi… by @sywangyi in #12842
  • Support for control-lora by @lavinal712 in #10686
  • Add support for LongCat-Image by @junqiangwu in #12828
  • fix the prefix_token_len bug by @junqiangwu in #12845
  • extend TorchAoTest::test_model_memory_usage to other platform by @sywangyi in #12768
  • Qwen Image Layered Support by @naykun in #12853
  • Z-Image-Turbo ControlNet by @hlky in #12792
  • Cosmos Predict2.5 Base: inference pipeline, scheduler & chkpt conversion by @miguelmartin75 in #12852
  • more update in modular by @yiyixuxu in #12560
  • Feature: Add Mambo-G Guidance as Guider by @MatrixTeam-AI in #12862
  • Add OvisImagePipeline in AUTO_TEXT2IMAGE_PIPELINES_MAPPING by @alvarobartt in #12876
  • Cosmos Predict2.5 14b Conversion by @miguelmartin75 in #12863
  • Use T5Tokenizer instead of MT5Tokenizer (removed in Transformers v5.0+) by @alvarobartt in #12877
  • Add z-image-omni-base implementation by @RuoyiDu in #12857
  • fix torchao quantizer for new torchao versions by @vkuzo in #12901
  • fix Qwen Image Transformer single file loading mapping function to be consistent with other loader APIs by @mbalabanski in #12894
  • Z-Image-Turbo from_single_file fix by @hlky in #12888
  • chore: fix dev version in setup.py by @DefTruth in #12904
  • Community Pipeline: Add z-image differential img2img by @r4inm4ker in #12882
  • Fix typo in src/diffusers/pipelines/cosmos/pipeline_cosmos2_5_predict.py by @miguelmartin75 in #12914
  • Fix wan 2.1 i2v context parallel by @DefTruth in #12909
  • fix the use of device_map in CP docs by @sayakpaul in #12902
  • [core] remove unneeded autoencoder methods when subclassing from AutoencoderMixin by @sayakpaul in #12873
  • Detect 2.0 vs 2.1 ZImageControlNetModel by @hlky in #12861
  • Refactor environment variable assignments in workflow by @paulinebm in #12916
  • Add codeQL workflow by @paulinebm in #12917
  • Delete .github/workflows/codeql.yml by @paulinebm (direct commit on v0.37.0-release)
  • CodeQL workflow for security analysis by @paulinebm (direct commit on v0.37.0-release)
  • Check for attention mask in backends that don't support it by @dxqb in #12892
  • [Flux.1] improve pos embed for ascend npu by computing on npu by @zhangtao0408 in #12897
  • LTX Video 0.9.8 long multi prompt by @yaoqih in #12614
  • Add FSDP option for Flux2 by @leisuzz in #12860
  • Add transformer cache context for SkyReels-V2 pipelines & Update docs by @tolgacangoz in #12837
  • [docs] fix torchao typo. by @sayakpaul in #12883
  • Update wan.md to remove unneeded hfoptions by @sayakpaul in #12890
  • Improve docstrings and type hints in scheduling_edm_euler.py by @delmalih in #12871
  • [Modular] Video for Mellon by @asomoza in #12924
  • Add LTX 2.0 Video Pipelines by @dg845 in #12915
  • Add environment variables to checkout step by @paulinebm in #12927
  • Improve docstrings and type hints in scheduling_consistency_decoder.py by @delmalih in #12928
  • Fix: Remove hardcoded CUDA autocast in Kandinsky 5 to fix import warning by @adi776borate in #12814
  • Upgrade GitHub Actions for Node 24 compatibility by @salmanmkc in #12865
  • fix the warning torch_dtype is deprecated by @msdsm in #12841
  • [NPU] npu attention enable ulysses by @TmacAaron in #12919
  • Torchao floatx version guard by @howardzhang-cv in #12923
  • Bugfix for dreambooth flux2 img2img2 by @leisuzz in #12825
  • [Modular] qwen refactor by @yiyixuxu in #12872
  • [modular] Tests for custom blocks in modular diffusers by @sayakpaul in #12557
  • [chore] remove controlnet implementations outside controlnet module. by @sayakpaul in #12152
  • [core] Handle progress bar and logging in distributed environments by @sayakpaul in #12806
  • Improve docstrings and type hints in scheduling_consistency_models.py by @delmalih in #12931
  • [Feature] MultiControlNet support for SD3Impainting by @ishan-modi in #11251
  • Laplace Scheduler for DDPM by @gapatron in #11320
  • Store vae.config.scaling_factor to prevent missing attr reference (sdxl advanced dreambooth training script) by @Teriks in #12346
  • Add thread-safe wrappers for components in pipeline (examples/server-async/utils/requestscopedpipeline.py) by @FredyRivera-dev in #12515
  • [Research] Latent Perceptual Loss (LPL) for Stable Diffusion XL by @kashif in #11573
  • Change timestep device to cpu for xla by @bhavya01 in #11501
  • [LoRA] add lora_alpha to sana README by @linoytsaban in #11780
  • Fix wrong param types, docs, and handles noise=None in scale_noise of FlowMatching schedulers by @Promisery in #11669
  • [docs] Remote inference by @stevhliu in #12372
  • Align HunyuanVideoConditionEmbedding with CombinedTimestepGuidanceTextProjEmbeddings by @samutamm in #12316
  • [Fix] syntax in QwenImageEditPlusPipeline by @SahilCarterr in #12371
  • Fix ftfy name error in Wan pipeline by @dsocek in #12314
  • [modular] error early in enable_auto_cpu_offload by @sayakpaul in #12578
  • [ChronoEdit] support multiple loras by @zhangjiewu in #12679
  • fix how is_fsdp is determined by @sayakpaul in #12960
  • [LoRA] add LoRA support to LTX-2 by @sayakpaul in #12933
  • Fix: typo in autoencoder_dc.py by @tvelovraf in #12687
  • [Modular] better docstring by @yiyixuxu in #12932
  • [docs] polish caching docs. by @sayakpaul in #12684
  • Fix typos by @omahs in #12705
  • Fix link to diffedit implementation reference by @JuanFKurucz in #12708
  • Fix QwenImage txt_seq_lens handling by @kashif in #12702
  • Bugfix for flux2 img2img2 prediction by @leisuzz in #12855
  • Add Flag to PeftLoraLoaderMixinTests to Enable/Disable Text Encoder LoRA Tests by @dg845 in #12962
  • Add Unified Sequence Parallel attention by @Bissmella in #12693
  • [Modular] Changes for using WAN I2V by @asomoza in #12959
  • Z rz rz rz rz rz rz r cogview by @sayakpaul in #12973
  • Update distributed_inference.md to reposition sections by @sayakpaul in #12971
  • [chore] make transformers version check stricter for glm image. by @sayakpaul in #12974
  • Remove 8bit device restriction by @SunMarc in #12972
  • disable_mmap in pipeline from_pretrained by @hlky in #12854
  • [Modular] mellon utils by @yiyixuxu in #12978
  • LongCat Image pipeline: Allow offloading/quantization of text_encoder component by @Yahweasel in #12963
  • Add ChromaInpaintPipeline by @hameerabbasi in #12848
  • fix Qwen-Image series context parallel by @DefTruth in #12970
  • Flux2 klein by @yiyixuxu in #12982
  • [modular] fix a bug in mellon param & improve docstrings by @yiyixuxu in #12980
  • add klein docs. by @sayakpaul in #12984
  • LTX 2 Single File Support by @dg845 in #12983
  • [core] gracefully error out when attn-backend x cp combo isn't supported. by @sayakpaul in #12832
  • Improve docstrings and type hints in scheduling_cosine_dpmsolver_multistep.py by @delmalih in #12936
  • [Docs] Replace root CONTRIBUTING.md with symlink to source docs by @delmalih in #12986
  • make style && make quality by @sayakpaul (direct commit on v0.37.0-release)
  • Revert "make style && make quality" by @sayakpaul (direct commit on v0.37.0-release)
  • [chore] make style to push new changes. by @sayakpaul in #12998
  • Fibo edit pipeline by @galbria in #12930
  • Fix variable name in docstring for PeftAdapterMixin.set_adapters by @geekuillaume in #13003
  • Improve docstrings and type hints in scheduling_ddim_cogvideox.py by @delmalih in #12992
  • [scheduler] Support custom sigmas in UniPCMultistepScheduler by @a-r-r-o-w in #12109
  • feat: accelerate longcat-image with regional compile by @lgyStoic in #13019
  • Improve docstrings and type hints in scheduling_ddim_flax.py by @delmalih in #13010
  • Improve docstrings and type hints in scheduling_ddim_inverse.py by @delmalih in #13020
  • fix Dockerfiles for cuda and xformers. by @sayakpaul in #13022
  • Resnet only use contiguous in training mode. by @jiqing-feng in #12977
  • feat: add qkv projection fuse for longcat transformers by @lgyStoic in #13021
  • Improve docstrings and type hints in scheduling_ddim_parallel.py by @delmalih in #13023
  • Improve docstrings and type hints in scheduling_ddpm_flax.py by @delmalih in #13024
  • Improve docstrings and type hints in scheduling_ddpm_parallel.py by @delmalih in #13027
  • Remove *pooled_* mentions from Chroma inpaint by @hameerabbasi in #13026
  • Flag Flax schedulers as deprecated by @delmalih in #13031
  • [modular] add auto_docstring & more doc related refactors by @yiyixuxu in #12958
  • Upgrade GitHub Actions to latest versions by @salmanmkc in #12866
  • [From Single File] support from_single_file method for WanAnimateTransformer3DModel by @samadwar in #12691
  • Fix: Cosmos2.5 Video2World frame extraction and add default negative prompt by @adi776borate in #13018
  • [GLM-Image] Add batch support for GlmImagePipeline by @JaredforReal in #13007
  • [Qwen] avoid creating attention masks when there is no padding by @kashif in #12987
  • [modular]support klein by @yiyixuxu in #13002
  • [QwenImage] fix prompt isolation tests by @sayakpaul in #13042
  • fast tok update by @itazap in #13036
  • change to CUDA 12.9. by @sayakpaul in #13045
  • remove torchao autoquant from diffusers docs by @vkuzo in #13048
  • docs: improve docstring scheduling_dpm_cogvideox.py by @delmalih in #13044
  • Fix Wan/WanI2V patchification by @Jayce-Ping in #13038
  • LTX2 distilled checkpoint support by @rootonchair in #12934
  • [wan] fix layerwise upcasting tests on CPU by @sayakpaul in #13039
  • [ci] uniform run times and wheels for pytorch cuda. by @sayakpaul in #13047
  • docs: fix grammar in fp16_safetensors CLI warning by @Olexandr88 in #13040
  • [wan] fix wan 2.2 when either of the transformers isn't present. by @sayakpaul in #13055
  • [bug fix] GLM-Image fit new get_image_features API by @JaredforReal in #13052
  • Fix aiter availability check by @lauri9 in #13059
  • [Modular]add a real quick start guide by @yiyixuxu in #13029
  • feat: support Ulysses Anything Attention by @DefTruth in #12996
  • Refactor Model Tests by @DN6 in #12822
  • [Flux2] Fix LoRA loading for Flux2 Klein by adaptively enumerating transformer blocks by @songkey in #13030
  • [Modular] loader related by @yiyixuxu in #13025
  • [Modular] mellon doc etc by @yiyixuxu in #13051
  • [modular] change the template modular pipeline card by @sayakpaul in #13072
  • Add support for Magcache by @AlanPonnachan in #12744
  • [docs] Fix syntax error in quantization configuration by @sayakpaul in #13076
  • docs: improve docstring scheduling_dpmsolver_multistep_inverse.py by @delmalih in #13083
  • [core] make flux hidden states contiguous by @sayakpaul in #13068
  • [core] make qwen hidden states contiguous to make torchao happy. by @sayakpaul in #13081
  • Feature/zimage inpaint pipeline by @CalamitousFelicitousness in #13006
  • GGUF fix for unquantized types when using unquantize kernels by @dxqb in #12498
  • docs: improve docstring scheduling_dpmsolver_multistep_inverse.py by @delmalih in #13085
  • [modular]simplify components manager doc by @yiyixuxu in #13088
  • ZImageControlNet cfg by @hlky in #13080
  • [Modular] refactor Wan: modular pipelines by task etc by @yiyixuxu in #13063
  • [Modular] guard ModularPipeline.blocks attribute by @yiyixuxu in #13014
  • LTX 2 Improve encode_video by Accepting More Input Types by @dg845 in #13057
  • Z image lora training by @linoytsaban in #13056
  • [modular] add modular tests for Z-Image and Wan by @sayakpaul in #13078
  • [Docs] Add guide for AutoModel with custom code by @DN6 in #13099
  • [SkyReelsV2] Fix ftfy import by @asomoza in #13113
  • [lora] fix non-diffusers lora key handling for flux2 by @sayakpaul in #13119
  • [CI] Refactor Wan Model Tests by @DN6 in #13082
  • docs: improve docstring scheduling_edm_dpmsolver_multistep.py by @delmalih in #13122
  • [Fix]Allow prompt and prior_token_ids to be provided simultaneously in GlmImagePipeline by @JaredforReal in #13092
  • docs: improve docstring scheduling_flow_match_euler_discrete.py by @delmalih in #13127
  • Cosmos Transfer2.5 inference pipeline: general/{seg, depth, blur, edge} by @miguelmartin75 in #13066
  • [modular] add tests for robust model loading. by @sayakpaul in #13120
  • Fix LTX-2 Inference when num_videos_per_prompt > 1 and CFG is Enabled by @dg845 in #13121
  • [CI] Fix setuptools pkg_resources Errors by @dg845 in #13129
  • docs: improve docstring scheduling_flow_match_heun_discrete.py by @delmalih in #13130
  • [CI] Fix setuptools pkg_resources Bug for PR GPU Tests by @dg845 in #13132
  • fix cosmos transformer typing. by @sayakpaul in #13134
  • Sunset Python 3.8 & get rid of explicit typing exports where possible by @sayakpaul in #12524
  • feat: implement apply_lora_scale to remove boilerplate. by @sayakpaul in #12994
  • [docs] fix ltx2 i2v docstring. by @sayakpaul in #13135
  • [Modular] add different pipeine blocks to init by @yiyixuxu in #13145
  • fix MT5Tokenizer by @yiyixuxu in #13146
  • fix guider by @yiyixuxu in #13147
  • [Modular] update doc for ModularPipeline by @yiyixuxu in #13100
  • [Modular] add explicit workflow support by @yiyixuxu in #13028
  • [LTX2] Fix wrong lora mixin by @asomoza in #13144
  • [Pipelines] Remove k-diffusion by @DN6 in #13152
  • [tests] accept recompile_limit from the user in tests by @sayakpaul in #13150
  • [core] support device type device_maps to work with offloading. by @sayakpaul in #12811
  • [Bug] Fix QwenImageEditPlus Series on NPU by @zhangtao0408 in #13017
  • [CI] Add ftfy as a test dependency by @DN6 in #13155
  • docs: improve docstring scheduling_flow_match_lcm.py by @delmalih in #13160
  • [docs] add docs for qwenimagelayered by @stevhliu in #13158
  • Flux2: Tensor tuples can cause issues for checkpointing by @dxqb in #12777
  • [CI] Revert setuptools CI Fix as the Failing Pipelines are Deprecated by @dg845 in #13149
  • Fix ftfy import for PRX Pipeline by @dg845 in #13154
  • [core] Enable CP for kernels-based attention backends by @sayakpaul in #12812
  • remove deps related to test from ci by @sayakpaul in #13164
  • [CI] Fix new LoRAHotswap tests by @DN6 in #13163
  • [gguf][torch.compile time] Convert to plain tensor earlier in dequantize_gguf_tensor by @anijain2305 in #13166
  • Support Flux Klein peft (fal) lora format by @asomoza in #13169
  • Fix T5GemmaEncoder loading for transformers 5.x composite T5GemmaConfig by @DavidBert in #13143
  • Allow Automodel to use from_config with custom code. by @DN6 in #13123
  • Fix AutoModel typing Import Error by @dg845 in #13178
  • migrate to transformers v5 by @sayakpaul in #12976
  • fix: graceful fallback when attention backends fail to import by @sym-bot in #13060
  • [docs] Fix torchrun command argument order in docs by @sayakpaul in #13181
  • [attention backends] use dedicated wrappers from fa3 for cp. by @sayakpaul in #13165
  • Cosmos Transfer2.5 Auto-Regressive Inference Pipeline by @miguelmartin75 in #13114
  • Fix wrong do_classifier_free_guidance threshold in ZImagePipeline by @kirillsst in #13183
  • Fix Flash Attention 3 interface for new FA3 return format by @veeceey in #13173
  • Fix LTX-2 image-to-video generation failure in two stages generation by @Songrui625 in #13187
  • Fixing Kohya loras loading: Flux.1-dev loras with TE ("lora_te1_" prefix) by @christopher5106 in #13188
  • [Modular] update the auto pipeline blocks doc by @yiyixuxu in #13148
  • [tests] consistency tests for modular index by @sayakpaul in #13192
  • [modular] fallback to default_blocks_name when loading base block classes in ModularPipeline by @yiyixuxu in #13193
  • [chore] updates in the pypi publication workflow. by @sayakpaul in #12805
  • [tests] enable cpu offload test in torchao without compilation. by @sayakpaul in #12704
  • remove db utils from benchmarking by @sayakpaul in #13199
  • [AutoModel] Fix bug with subfolders and local model paths when loading custom code by @DN6 in #13197
  • [AutoModel] Allow registering auto_map to model config by @DN6 in #13186
  • [Modular] Save Modular Pipeline weights to Hub by @DN6 in #13168
  • docs: improve docstring scheduling_ipndm.py by @delmalih in #13198
  • Clean up accidental files by @DN6 in #13202
  • [modular]Update model card to include workflow by @yiyixuxu in #13195
  • [modular] not pass trust_remote_code to external repos by @yiyixuxu in #13204
  • [Modular] implement requirements validation for custom blocks by @sayakpaul in #12196
  • cogvideo example: Distribute VAE video encoding across processes in CogVideoX LoRA training by @jiqing-feng in #13207
  • Fix group-offloading bug by @SHYuanBest in #13211
  • Add Helios-14B Video Generation Pipelines by @dg845 in #13208
  • [Z-Image] Fix more do_classifier_free_guidance thresholds by @asomoza in #13212
  • [lora] fix zimage lora conversion to support for more lora. by @sayakpaul in #13209
  • adding lora support to z-image controlnet pipelines by @christopher5106 in #13200
  • Add LTX2 Condition Pipeline by @dg845 in #13058
  • Fix Helios paper link in documentation by @SHYuanBest in #13213
  • [attention backends] change to updated repo and version. by @sayakpaul in #13161
  • feat: implement rae autoencoder. by @Ando233 in #13046
  • Release: v0.37.0-release by @sayakpaul (direct commit on v0.37.0-release)

Significant community contributions

The following contributors have made significant changes to the library over the last release:

  • @delmalih
    • Improve docstrings and type hints in scheduling_dpmsolver_singlestep.py (#12798)
    • Improve docstrings and type hints in scheduling_edm_euler.py (#12871)
    • Improve docstrings and type hints in scheduling_consistency_decoder.py (#12928)
    • Improve docstrings and type hints in scheduling_consistency_models.py (#12931)
    • Improve docstrings and type hints in scheduling_cosine_dpmsolver_multistep.py (#12936)
    • [Docs] Replace root CONTRIBUTING.md with symlink to source docs (#12986)
    • Improve docstrings and type hints in scheduling_ddim_cogvideox.py (#12992)
    • Improve docstrings and type hints in scheduling_ddim_flax.py (#13010)
    • Improve docstrings and type hints in scheduling_ddim_inverse.py (#13020)
    • Improve docstrings and type hints in scheduling_ddim_parallel.py (#13023)
    • Improve docstrings and type hints in scheduling_ddpm_flax.py (#13024)
    • Improve docstrings and type hints in scheduling_ddpm_parallel.py (#13027)
    • Flag Flax schedulers as deprecated (#13031)
    • docs: improve docstring scheduling_dpm_cogvideox.py (#13044)
    • docs: improve docstring scheduling_dpmsolver_multistep_inverse.py (#13083)
    • docs: improve docstring scheduling_dpmsolver_multistep_inverse.py (#13085)
    • docs: improve docstring scheduling_edm_dpmsolver_multistep.py (#13122)
    • docs: improve docstring scheduling_flow_match_euler_discrete.py (#13127)
    • docs: improve docstring scheduling_flow_match_heun_discrete.py (#13130)
    • docs: improve docstring scheduling_flow_match_lcm.py (#13160)
    • docs: improve docstring scheduling_ipndm.py (#13198)
  • @yiyixuxu
    • [Modular]z-image (#12808)
    • more update in modular (#12560)
    • [Modular] qwen refactor (#12872)
    • [Modular] better docstring (#12932)
    • [Modular] mellon utils (#12978)
    • Flux2 klein (#12982)
    • [modular] fix a bug in mellon param & improve docstrings (#12980)
    • [modular] add auto_docstring & more doc related refactors (#12958)
    • [modular]support klein (#13002)
    • [Modular]add a real quick start guide (#13029)
    • [Modular] loader related (#13025)
    • [Modular] mellon doc etc (#13051)
    • [modular]simplify components manager doc (#13088)
    • [Modular] refactor Wan: modular pipelines by task etc (#13063)
    • [Modular] guard ModularPipeline.blocks attribute (#13014)
    • [Modular] add different pipeine blocks to init (#13145)
    • fix MT5Tokenizer (#13146)
    • fix guider (#13147)
    • [Modular] update doc for ModularPipeline (#13100)
    • [Modular] add explicit workflow support (#13028)
    • [Modular] update the auto pipeline blocks doc (#13148)
    • [modular] fallback to default_blocks_name when loading base block classes in ModularPipeline (#13193)
    • [modular]Update model card to include workflow (#13195)
    • [modular] not pass trust_remote_code to external repos (#13204)
  • @sayakpaul
    • Fix Qwen Edit Plus modular for multi-image input (#12601)
    • [docs] improve distributed inference cp docs. (#12810)
    • post release 0.36.0 (#12804)
    • Update distributed_inference.md to correct syntax (#12827)
    • [lora] Remove lora docs unneeded and add " # Copied from ..." (#12824)
    • fix the use of device_map in CP docs (#12902)
    • [core] remove unneeded autoencoder methods when subclassing from AutoencoderMixin (#12873)
    • [docs] fix torchao typo. (#12883)
    • Update wan.md to remove unneeded hfoptions (#12890)
    • [modular] Tests for custom blocks in modular diffusers (#12557)
    • [chore] remove controlnet implementations outside controlnet module. (#12152)
    • [core] Handle progress bar and logging in distributed environments (#12806)
    • [modular] error early in enable_auto_cpu_offload (#12578)
    • fix how is_fsdp is determined (#12960)
    • [LoRA] add LoRA support to LTX-2 (#12933)
    • [docs] polish caching docs. (#12684)
    • Z rz rz rz rz rz rz r cogview (#12973)
    • Update distributed_inference.md to reposition sections (#12971)
    • [chore] make transformers version check stricter for glm image. (#12974)
    • add klein docs. (#12984)
    • [core] gracefully error out when attn-backend x cp combo isn't supported. (#12832)
    • make style && make quality
    • Revert "make style && make quality"
    • [chore] make style to push new changes. (#12998)
    • fix Dockerfiles for cuda and xformers. (#13022)
    • [QwenImage] fix prompt isolation tests (#13042)
    • change to CUDA 12.9. (#13045)
    • [wan] fix layerwise upcasting tests on CPU (#13039)
    • [ci] uniform run times and wheels for pytorch cuda. (#13047)
    • [wan] fix wan 2.2 when either of the transformers isn't present. (#13055)
    • [modular] change the template modular pipeline card (#13072)
    • [docs] Fix syntax error in quantization configuration (#13076)
    • [core] make flux hidden states contiguous (#13068)
    • [core] make qwen hidden states contiguous to make torchao happy. (#13081)
    • [modular] add modular tests for Z-Image and Wan (#13078)
    • [lora] fix non-diffusers lora key handling for flux2 (#13119)
    • [modular] add tests for robust model loading. (#13120)
    • fix cosmos transformer typing. (#13134)
    • Sunset Python 3.8 & get rid of explicit typing exports where possible (#12524)
    • feat: implement apply_lora_scale to remove boilerplate. (#12994)
    • [docs] fix ltx2 i2v docstring. (#13135)
    • [tests] accept recompile_limit from the user in tests (#13150)
    • [core] support device type device_maps to work with offloading. (#12811)
    • [core] Enable CP for kernels-based attention backends (#12812)
    • remove deps related to test from ci (#13164)
    • migrate to transformers v5 (#12976)
    • [docs] Fix torchrun command argument order in docs (#13181)
    • [attention backends] use dedicated wrappers from fa3 for cp. (#13165)
    • [tests] consistency tests for modular index (#13192)
    • [chore] updates in the pypi publication workflow. (#12805)
    • [tests] enable cpu offload test in torchao without compilation. (#12704)
    • remove db utils from benchmarking (#13199)
    • [Modular] implement requirements validation for custom blocks (#12196)
    • [lora] fix zimage lora conversion to support for more lora. (#13209)
    • [attention backends] change to updated repo and version. (#13161)
    • Release: v0.37.0-release
  • @DN6
    • [WIP] Add Flux2 modular (#12763)
    • Refactor Model Tests (#12822)
    • [Docs] Add guide for AutoModel with custom code (#13099)
    • [CI] Refactor Wan Model Tests (#13082)
    • [Pipelines] Remove k-diffusion (#13152)
    • [CI] Add ftfy as a test dependency (#13155)
    • [CI] Fix new LoRAHotswap tests (#13163)
    • Allow Automodel to use from_config with custom code. (#13123)
    • [AutoModel] Fix bug with subfolders and local model paths when loading custom code (#13197)
    • [AutoModel] Allow registering auto_map to model config (#13186)
    • [Modular] Save Modular Pipeline weights to Hub (#13168)
    • Clean up accidental files (#13202)
  • @naykun
    • [qwen-image] edit 2511 support (#12839)
    • Qwen Image Layered Support (#12853)
  • @junqiangwu
    • Add support for LongCat-Image (#12828)
    • fix the prefix_token_len bug (#12845)
  • @hlky
    • Z-Image-Turbo ControlNet (#12792)
    • Z-Image-Turbo from_single_file fix (#12888)
    • Detect 2.0 vs 2.1 ZImageControlNetModel (#12861)
    • disable_mmap in pipeline from_pretrained (#12854)
    • ZImageControlNet cfg (#13080)
  • @miguelmartin75
    • Cosmos Predict2.5 Base: inference pipeline, scheduler & chkpt conversion (#12852)
    • Cosmos Predict2.5 14b Conversion (#12863)
    • Fix typo in src/diffusers/pipelines/cosmos/pipeline_cosmos2_5_predict.py (#12914)
    • Cosmos Transfer2.5 inference pipeline: general/{seg, depth, blur, edge} (#13066)
    • Cosmos Transfer2.5 Auto-Regressive Inference Pipeline (#13114)
  • @RuoyiDu
    • Add z-image-omni-base implementation (#12857)
  • @r4inm4ker
    • Community Pipeline: Add z-image differential img2img (#12882)
  • @yaoqih
    • LTX Video 0.9.8 long multi prompt (#12614)
  • @dg845
    • Add LTX 2.0 Video Pipelines (#12915)
    • Add Flag to PeftLoraLoaderMixinTests to Enable/Disable Text Encoder LoRA Tests (#12962)
    • LTX 2 Single File Support (#12983)
    • LTX 2 Improve encode_video by Accepting More Input Types (#13057)
    • Fix LTX-2 Inference when num_videos_per_prompt > 1 and CFG is Enabled (#13121)
    • [CI] Fix setuptools pkg_resources Errors (#13129)
    • [CI] Fix setuptools pkg_resources Bug for PR GPU Tests (#13132)
    • [CI] Revert setuptools CI Fix as the Failing Pipelines are Deprecated (#13149)
    • Fix ftfy import for PRX Pipeline (#13154)
    • Fix AutoModel typing Import Error (#13178)
    • Add Helios-14B Video Generation Pipelines (#13208)
    • Add LTX2 Condition Pipeline (#13058)
  • @kashif
    • [Research] Latent Perceptual Loss (LPL) for Stable Diffusion XL (#11573)
    • Fix QwenImage txt_seq_lens handling (#12702)
    • [Qwen] avoid creating attention masks when there is no padding (#12987)
  • @bhavya01
    • Change timestep device to cpu for xla (#11501)
  • @linoytsaban
    • [LoRA] add lora_alpha to sana README (#11780)
    • Z image lora training (#13056)
  • @stevhliu
    • [docs] Remote inference (#12372)
    • [docs] add docs for qwenimagelayered (#13158)
  • @hameerabbasi
    • Add ChromaInpaintPipeline (#12848)
    • Remove *pooled_* mentions from Chroma inpaint (#13026)
  • @galbria
    • Fibo edit pipeline (#12930)
  • @JaredforReal
    • [GLM-Image] Add batch support for GlmImagePipeline (#13007)
    • [bug fix] GLM-Image fit new get_image_features API (#13052)
    • [Fix]Allow prompt and prior_token_ids to be provided simultaneously in GlmImagePipeline (#13092)
  • @rootonchair
    • LTX2 distilled checkpoint support (#12934)
  • @AlanPonnachan
    • Add support for Magcache (#12744)
  • @CalamitousFelicitousness
    • Feature/zimage inpaint pipeline (#13006)
  • @Ando233
    • feat: implement rae autoencoder. (#13046)
