github huggingface/accelerate v1.9.0
v1.9.0: Trackio support, Model loading speedup, Minor distributed improvements

Trackio tracker support

We've added support for trackio, a lightweight, 💯 free experiment tracking Python library built on top of 🤗 Datasets and Spaces.

Main features are:

  • Local-first design: dashboard runs locally by default. You can also host it on Spaces by specifying a space_id.
  • Persists logs locally (or in a private Hugging Face Dataset)
  • Visualize experiments with a Gradio dashboard locally (or on Hugging Face Spaces)
  • Everything here, including hosting on Hugging Face Spaces, is free!

To use it with accelerate, set log_with="trackio" and initialize the trackers:

accelerator = Accelerator(log_with="trackio")
config = {"learning_rate": 0.001, "batch_size": 32}
# pass init_kwargs in order to host the dashboard on Spaces
init_kwargs = {"trackio": {"space_id": "hf_username/space_name"}}
accelerator.init_trackers("example_project", config=config, init_kwargs=init_kwargs)

Thanks @pcuenca for the integration!

Model loading speedup when relying on set_module_tensor_to_device

Setting a tensor while clearing the cache is very slow, so we added a clear_device option to disable it.
Another small optimization is using non_blocking everywhere and syncing only once, just before returning control to the user. This makes loading slightly faster.

  • Speedup model loading by 4-5x in Diffusers ⚑ by @a-r-r-o-w in #3674
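A minimal sketch of the pattern behind this speedup (not the library's actual internals): queue every copy with non_blocking=True and synchronize once at the end, instead of paying a sync or cache clear per tensor. It also runs on CPU, where the copies are simply synchronous.

```python
import torch

def move_tensors(tensors, device):
    """Copy all tensors with non_blocking=True, syncing only once at the end."""
    moved = [t.to(device, non_blocking=True) for t in tensors]
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # single sync just before returning control
    return moved

moved = move_tensors([torch.ones(2), torch.zeros(3)], "cpu")
print([t.shape[0] for t in moved])  # [2, 3]
```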

FSDP, DeepSpeed, FP8 minor improvements

  • Add support for e5m2 and default to hybrid when launcher is used by @IlyasMoutawwakil in #3640
  • Fix FP8 tests, enable FP8 to be used without direct Accelerator() configuring by @pstjohn in #3677
  • Bunch of FSDP improvements by @S1ro1 in #3671
  • Fix: properly error when DDP + Dtensor model by @S1ro1 in #3629
  • Fix fsdp2 example typo by @shimizust in #3657
  • Added a check in no_sync() to avoid errors when using deepspeed zero2/3 by @xliu0105 in #3656

🚨🚨🚨 Breaking changes 🚨🚨🚨

find_executable_batch_size() no longer halves the batch size after every OOM. Instead, it multiplies the batch size by 0.9. This should help users waste less GPU capacity.

  • β€œStop Halving My Batch!” Β· Default back-off 0.5 β†’ 0.9 by @SunMarc in #3684

Full Changelog: v1.8.1...v1.9.0
