github huggingface/diffusers v0.21.0
v0.21.0: Würstchen, Faster LoRA loading, Faster imports, T2I Adapters for SDXL, and more

latest releases: v0.31.0, v0.30.3, v0.30.2...
14 months ago

Würstchen

Würstchen is a diffusion model, whose text-conditional model works in a highly compressed latent space of images, allowing cheaper and faster inference.

Here is how to use the Würstchen as a pipeline:

import torch
from diffusers import AutoPipelineForText2Image
from diffusers.pipelines.wuerstchen import DEFAULT_STAGE_C_TIMESTEPS

pipeline = AutoPipelineForText2Image.from_pretrained("warp-ai/wuerstchen", torch_dtype=torch.float16).to("cuda")

caption = "Anthropomorphic cat dressed as a firefighter"
images = pipeline(
	caption,
	height=1024,
	width=1536,
	prior_timesteps=DEFAULT_STAGE_C_TIMESTEPS,
	prior_guidance_scale=4.0,
	num_images_per_prompt=4,
).images

To learn more about the pipeline, check out the official documentation.

This pipeline was contributed by one of the authors of Würstchen, @dome272, with help from @kashif and @patrickvonplaten.

👉 Try out the model here: https://huggingface.co/spaces/warp-ai/Wuerstchen

T2I Adapters for Stable Diffusion XL (SDXL)

T2I-Adapter is an efficient plug-and-play model that provides extra guidance to pre-trained text-to-image models while freezing the original large text-to-image models.

In collaboration with the Tencent ARC researchers, we trained T2I Adapters on various conditions: sketch, canny, lineart, depth, and openpose.

Below is an how to use the StableDiffusionXLAdapterPipeline.

First ensure, the controlnet_aux is installed:

pip install -U controlnet_aux==0.0.7

Then we can initialize the pipeline:

import torch
from controlnet_aux.lineart import LineartDetector
from diffusers import (AutoencoderKL, EulerAncestralDiscreteScheduler,
                       StableDiffusionXLAdapterPipeline, T2IAdapter)
from diffusers.utils import load_image, make_image_grid

# load adapter
adapter = T2IAdapter.from_pretrained(
    "TencentARC/t2i-adapter-lineart-sdxl-1.0", torch_dtype=torch.float16, varient="fp16"
).to("cuda")

# load pipeline
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
euler_a = EulerAncestralDiscreteScheduler.from_pretrained(
    model_id, subfolder="scheduler"
)
vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16
)
pipe = StableDiffusionXLAdapterPipeline.from_pretrained(
    model_id,
    vae=vae,
    adapter=adapter,
    scheduler=euler_a,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

# load lineart detector
line_detector = LineartDetector.from_pretrained("lllyasviel/Annotators").to("cuda")

We then load an image to compute the lineart conditionings:

url = "https://huggingface.co/Adapter/t2iadapter/resolve/main/figs_SDXLV1.0/org_lin.jpg"
image = load_image(url)
image = line_detector(image, detect_resolution=384, image_resolution=1024)

Then we generate:

prompt = "Ice dragon roar, 4k photo"
negative_prompt = "anime, cartoon, graphic, text, painting, crayon, graphite, abstract, glitch, deformed, mutated, ugly, disfigured"
gen_images = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    image=image,
    num_inference_steps=30,
    adapter_conditioning_scale=0.8,
    guidance_scale=7.5,
).images[0]

Refer to the official documentation to learn more about StableDiffusionXLAdapterPipeline.

This blog post summarizes our experiences and provides all the resources (including the pre-trained T2I Adapter checkpoints) to get started using T2I Adapters for SDXL.

We’re also releasing a training script for training your custom T2I Adapters on SDXL. Check out the documentation to learn more.

Thanks to @MC-E (one of the authors of T2I Adapters) for contributing the StableDiffusionXLAdapterPipeline in #4696.

Faster imports

We introduced “lazy imports” (#4829) to significantly improve the time it takes to import our modules (such as pipelines, models, and so on). Below is a comparison of the timings with and without lazy imports on import diffusers.

With lazy imports:

real    0m0.417s
user    0m0.714s
sys     0m0.499s

Without lazy imports:

real    0m5.391s
user    0m5.299s
sys     0m1.273s

Faster LoRA loading

Previously, loading LoRA parameters using the load_lora_weights() used to be time-consuming as reported in #4975. To this end, we introduced a low_cpu_mem_usage argument to the load_lora_weights() method in #4994 which should speed up the loading time significantly. Just pass low_cpu_mem_usage=True to take the benefits.

LoRA fusing

LoRA weights can now be fused into the model weights, thus allowing models that have loaded LoRA weights to run as fast as models without. It also enables to fuse multiple LoRAs into the same model.

For more information, have a look at the documentation and the original PR: #4473.

More support for LoRAs

Almost all LoRA formats out there for SDXL are now supported. For a more details, please check the documentation.

All commits

Significant community contributions

The following contributors have made significant changes to the library over the last release:

Don't miss a new diffusers release

NewReleases is sending notifications on new releases.