Core
- Accelerate can now optimize NUMA affinity, which can help increase throughput on NVIDIA multi-GPU systems. To enable it, either follow the prompt during accelerate config, set the ACCELERATE_CPU_AFFINITY=1 environment variable, or set it manually using the following:
from accelerate.utils import set_numa_affinity
# For GPU 0
set_numa_affinity(0)
Big thanks to @stas00 for the recommendation, request, and feedback during development
- Allow for setting deterministic algorithms in set_seed by @muellerzr in #2569
- Fixed the test script for TPU v2/v3 by @vanbasten23 in #2542
- Cambricon MLU device support introduced by @huismiling in #2552
- PartialState and AcceleratorState were heavily refactored to make future-proofing easier and to simplify adding new devices by @muellerzr in #2576
- Fixed a reproducibility issue in distributed environments with Dataloader shuffling when using BatchSamplerShard by @universuen in #2584
- notebook_launcher can use multiple GPUs in Google Colab if using a custom instance that supports multiple GPUs by @StefanTodoran in #2561
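As a rough illustration of the deterministic-algorithms option in set_seed noted above, the sketch below seeds everything and requests deterministic kernels. The helper name seed_with_determinism is hypothetical, and the guards are there only so the snippet degrades gracefully when accelerate or torch is missing, or when the installed accelerate predates the deterministic argument.

```python
import importlib.util
import inspect


def seed_with_determinism(seed: int = 42) -> bool:
    """Seed RNGs via accelerate and request deterministic algorithms.

    Hypothetical helper for illustration. Returns True when set_seed ran
    with deterministic=True, and False when accelerate or torch is not
    installed or the installed set_seed lacks the `deterministic` argument.
    """
    for package in ("accelerate", "torch"):
        if importlib.util.find_spec(package) is None:
            return False

    from accelerate.utils import set_seed

    # Older accelerate releases do not accept `deterministic`.
    if "deterministic" not in inspect.signature(set_seed).parameters:
        return False

    # Seeds the Python, NumPy, and torch RNGs and asks torch to use
    # deterministic algorithms where they are available.
    set_seed(seed, deterministic=True)
    return True


if __name__ == "__main__":
    print("deterministic seeding applied:", seed_with_determinism(42))
```

Deterministic kernels can be slower, and some operations may raise an error if no deterministic implementation exists, so this is probably best reserved for debugging and reproducibility checks.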
Big Model Inference
- Add a log message for RTX 4000 series GPUs when performing multi-GPU inference with device_map, which can lead to hanging, by @SunMarc in #2557
- Fix load_checkpoint_in_model behavior when unexpected keys are in the checkpoint by @fxmarty in #2588
DeepSpeed
- Fix issue with the mapping of main_process_ip and master_addr when not using standard as deepspeed launcher by @asdfry in #2495
- Improve deepspeed env gen by checking for bad keys, by @muellerzr and @ricklamers in #2565
- We now support custom deepspeed env files. As with normal deepspeed, set the path with the DS_ENV_FILE environment variable by @muellerzr in #2566
- Resolve ZeRO-3 Initialization Failure in already-started distributed environments by @sword865 in #2578
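A minimal sketch of the custom env file workflow mentioned above, assuming the usual DeepSpeed convention of one KEY=VALUE pair per line. The helper names, the file name custom_deepspeed_env, and the NCCL_DEBUG value are illustrative assumptions, not part of the release.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping


def write_ds_env_file(variables: Mapping[str, str], directory: str) -> Path:
    """Write KEY=VALUE lines to a file (hypothetical helper)."""
    path = Path(directory) / "custom_deepspeed_env"  # illustrative file name
    lines = [f"{key}={value}" for key, value in variables.items()]
    path.write_text("\n".join(lines) + "\n")
    return path


def use_ds_env_file(path: Path) -> None:
    """Point DS_ENV_FILE at the custom file for this process and its children."""
    os.environ["DS_ENV_FILE"] = str(path)


if __name__ == "__main__":
    env_dir = tempfile.mkdtemp()
    env_file = write_ds_env_file({"NCCL_DEBUG": "INFO"}, env_dir)
    use_ds_env_file(env_file)
    print("DS_ENV_FILE =", os.environ["DS_ENV_FILE"])
```

Exporting DS_ENV_FILE in the shell before launching should have the same effect; the Python form is shown only because it is easy to check.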
What's Changed
- Fix test_script.py on TPU v2/v3 by @vanbasten23 in #2542
- Add mapping main_process_ip and master_addr when not using standard as deepspeed launcher by @asdfry in #2495
- split_between_processes for Dataset by @geronimi73 in #2433
- Include working driver check by @muellerzr in #2558
- 🚨🚨🚨Move to using tags rather than latest for docker images and consolidate image repos 🚨 🚨🚨 by @muellerzr in #2554
- Add Cambricon MLU accelerator support by @huismiling in #2552
- Add NUMA affinity control for NVIDIA GPUs by @muellerzr in #2535
- Add log message for RTX 4000 series when performing multi-gpu inference with device_map by @SunMarc in #2557
- Improve deepspeed env gen by @muellerzr in #2565
- Allow for setting deterministic algorithms by @muellerzr in #2569
- Unpin deepspeed by @muellerzr in #2570
- Rm uv install by @muellerzr in #2577
- Allow for custom deepspeed env files by @muellerzr in #2566
- [docs] Missing functions from API by @stevhliu in #2580
- Update data_loader.py to Ensure Reproducibility in Multi-Process Environments with Dataloader Shuffle by @universuen in #2584
- Refactor affinity and make it stateful by @muellerzr in #2579
- Refactor and improve model estimator tool by @muellerzr in #2581
- Fix load_checkpoint_in_model behavior when unexpected keys are in the checkpoint by @fxmarty in #2588
- Guard stateful objects by @muellerzr in #2572
- Expound PartialState docstring by @muellerzr in #2589
- [docs] Fix kwarg docstring by @stevhliu in #2590
- Allow notebook_launcher to launch to multiple GPUs from Colab by @StefanTodoran in #2561
- Fix warning log for unused checkpoint keys by @fxmarty in #2594
- Resolve ZeRO-3 Initialization Failure in Pre-Set Torch Distributed Environments (huggingface/transformers#28803) by @sword865 in #2578
- Refactor PartialState and AcceleratorState by @muellerzr in #2576
- Allow for force unwrapping by @muellerzr in #2595
- Pin hub for tests by @muellerzr in #2608
- Default false for trust_remote_code by @muellerzr in #2607
- fix llama example for pippy by @SunMarc in #2616
- Fix links in Quick Tour by @muellerzr in #2617
- Link to bash in env reporting by @muellerzr in #2623
- Unpin hub by @muellerzr in #2625
New Contributors
- @asdfry made their first contribution in #2495
- @geronimi73 made their first contribution in #2433
- @huismiling made their first contribution in #2552
- @universuen made their first contribution in #2584
- @StefanTodoran made their first contribution in #2561
- @sword865 made their first contribution in #2578
Full Changelog: v0.28.0...v0.29.0