🪄 Karlo UnCLIP by Kakao Brain
Karlo is a text-conditional image generation model based on OpenAI's unCLIP architecture, with an improved super-resolution module that upscales from 64px to 256px while recovering high-frequency details in only a small number of denoising steps.
This alpha version of Karlo is trained on 115M image-text pairs, including the COYO-100M high-quality subset, CC3M, and CC12M.
For more information about the architecture, see the Karlo repository: https://github.com/kakaobrain/karlo
```
pip install diffusers transformers safetensors accelerate
```
```python
import torch
from diffusers import UnCLIPPipeline

pipe = UnCLIPPipeline.from_pretrained("kakaobrain/karlo-v1-alpha", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "a high-resolution photograph of a big red frog on a green leaf."
image = pipe(prompt).images[0]
```
Community pipeline versioning
The community pipelines hosted in `diffusers/examples/community` will now follow the installed version of the library. E.g. if you have `diffusers==0.9.0` installed, the pipelines from the `v0.9.0` branch will be used: https://github.com/huggingface/diffusers/tree/v0.9.0/examples/community

If you've installed diffusers from source, e.g. with `pip install git+https://github.com/huggingface/diffusers`, then the latest versions of the pipelines will be fetched from the `main` branch.
To change the custom pipeline version, set the `custom_revision` argument like so:

```python
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "google/ddpm-cifar10-32", custom_pipeline="one_step_unet", custom_revision="0.10.2"
)
```
🦺 safetensors
Many of the most important checkpoints now have [safetensors](https://github.com/huggingface/safetensors) weights available. After installing safetensors with:

```
pip install safetensors
```

you will see a nice speed-up when loading your model 🚀

Some of the most important checkpoints that have safetensors weights added now:
- https://huggingface.co/stabilityai/stable-diffusion-2
- https://huggingface.co/stabilityai/stable-diffusion-2-1
- https://huggingface.co/stabilityai/stable-diffusion-2-depth
- https://huggingface.co/stabilityai/stable-diffusion-2-inpainting
Batched generation bug fixes 🐛
- Make sure all pipelines can run with batched input by @patrickvonplaten in #1669
We fixed a lot of bugs for batched generation. All pipelines should now correctly process batches of prompts and images 🤗
We also made it much easier to tweak images with reproducible seeds:
https://huggingface.co/docs/diffusers/using-diffusers/reusing_seeds
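The core idea behind reproducible batched generation — one `torch.Generator` per prompt in the batch — can be sketched without loading a model (the shapes and seeds below are illustrative only):

```python
import torch

# One generator per prompt in the batch; reusing the same seeds
# reproduces the same initial latents, and therefore the same images.
seeds = [0, 1, 2, 3]
generators = [torch.Generator("cpu").manual_seed(s) for s in seeds]

# In a pipeline call this list would be passed as `generator=generators`;
# here we just show that the sampled latents are reproducible.
latents_a = [torch.randn(4, 64, 64, generator=g) for g in generators]

# Re-seeding gives identical latents, so image i can be regenerated
# (or its prompt tweaked) without disturbing the rest of the batch.
generators = [torch.Generator("cpu").manual_seed(s) for s in seeds]
latents_b = [torch.randn(4, 64, 64, generator=g) for g in generators]

assert all(torch.equal(a, b) for a, b in zip(latents_a, latents_b))
```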
📝 Changelog
- Remove spurious arg in training scripts by @pcuenca in #1644
- dreambooth: fix #1566: maintain fp32 wrapper when saving a checkpoint to avoid crash when running fp16 by @timh in #1618
- Allow k pipeline to generate > 1 images by @pcuenca in #1645
- Remove unnecessary offset in img2img by @patrickvonplaten in #1653
- Remove unnecessary kwargs in depth2img by @maruel in #1648
- Add text encoder conversion by @lawfordp2017 in #1559
- VersatileDiffusion: fix input processing by @LukasStruppek in #1568
- tensor format ort bug fix by @prathikr in #1557
- Deprecate init image correctly by @patrickvonplaten in #1649
- fix bug if we don't do_classifier_free_guidance by @MKFMIKU in #1601
- Handle missing global_step key in scripts/convert_original_stable_diffusion_to_diffusers.py by @Cyberes in #1612
- [SD] Make sure scheduler is correct when converting by @patrickvonplaten in #1667
- [Textual Inversion] Do not update other embeddings by @patrickvonplaten in #1665
- Added Community pipeline for comparing Stable Diffusion v1.1-4 checkpoints by @suvadityamuk in #1584
- Fix wrong type checking in convert_diffusers_to_original_stable_diffusion.py by @apolinario in #1681
- [Version] Bump to 0.11.0.dev0 by @patrickvonplaten in #1682
- Dreambooth: save / restore training state by @pcuenca in #1668
- Disable telemetry when DISABLE_TELEMETRY is set by @w4ffl35 in #1686
- Change one-step dummy pipeline for testing by @patrickvonplaten in #1690
- [Community pipeline] Add github mechanism by @patrickvonplaten in #1680
- Dreambooth: use warnings instead of logger in parse_args() by @pcuenca in #1688
- manually update train_unconditional_ort by @prathikr in #1694
- Remove all local telemetry by @anton-l in #1702
- Update main docs by @patrickvonplaten in #1706
- [Readme] Clarify package owners by @anton-l in #1707
- Fix the bug that torch version less than 1.12 throws TypeError by @chinoll in #1671
- RePaint fast tests and API conforming by @anton-l in #1701
- Add state checkpointing to other training scripts by @pcuenca in #1687
- Improve pipeline_stable_diffusion_inpaint_legacy.py by @cyber-meow in #1585
- apply amp bf16 on textual inversion by @jiqing-feng in #1465
- Add examples with Intel optimizations by @hshen14 in #1579
- Added a README page for docs and a "schedulers" page by @yiyixuxu in #1710
- Accept latents as optional input in Latent Diffusion pipeline by @daspartho in #1723
- Fix ONNX img2img preprocessing and add fast tests coverage by @anton-l in #1727
- Fix ldm tests on master by not running the CPU tests on GPU by @patrickvonplaten in #1729
- Docs: recommend xformers by @pcuenca in #1724
- Nightly integration tests by @anton-l in #1664
- [Batched Generators] This PR adds generators that are useful to make batched generation fully reproducible by @patrickvonplaten in #1718
- Fix ONNX img2img preprocessing by @peterto in #1736
- Fix MPS fast test warnings by @anton-l in #1744
- Fix/update the LDM pipeline and tests by @anton-l in #1743
- kakaobrain unCLIP by @williamberman in #1428
- [fix] pipeline_unclip generator by @williamberman in #1751
- unCLIP docs by @williamberman in #1754
- Correct help text for scheduler_type flag in scripts. by @msiedlarek in #1749
- Add resnet_time_scale_shift to VD layers by @anton-l in #1757
- Add attention mask to uclip by @patrickvonplaten in #1756
- Support attn2==None for xformers by @anton-l in #1759
- [UnCLIPPipeline] fix num_images_per_prompt by @patil-suraj in #1762
- Add CPU offloading to UnCLIP by @anton-l in #1761
- [Versatile] fix attention mask by @patrickvonplaten in #1763
- [Revision] Don't recommend using revision by @patrickvonplaten in #1764
- [Examples] Update train_unconditional.py to include logging argument for Wandb by @ash0ts in #1719
- Transformers version req for UnCLIP by @anton-l in #1766