Core
- Set `timeout` default to PyTorch defaults based on backend by @muellerzr in #2758
- Fix duplicate elements in `split_between_processes` by @hkunzhe in #2781
- Add elastic launch support to `notebook_launcher` by @yhna940 in #2788
- Fix wrong use of `sync_gradients` used to implement `sync_each_batch` by @fabianlim in #2790
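The `split_between_processes` fix (#2781) concerns dividing a list across ranks without repeating elements when the list does not divide evenly. A minimal standalone sketch of that duplicate-free behavior (this is an illustrative helper, not accelerate's implementation; accelerate's API is a context manager on the `Accelerator` object):

```python
def split_between_processes(items, num_processes, process_index):
    """Give each rank a contiguous, non-overlapping slice of `items`.

    When len(items) is not divisible by num_processes, the first
    `extra` ranks each get one additional element instead of any
    element being duplicated across ranks.
    """
    base, extra = divmod(len(items), num_processes)
    start = process_index * base + min(process_index, extra)
    end = start + base + (1 if process_index < extra else 0)
    return items[start:end]
```

For example, splitting five items across two processes yields three items on rank 0 and two on rank 1; every element appears exactly once across all ranks.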
FSDP
- Introduce shard-merging util for FSDP by @muellerzr in #2772
- Enable sharded state dict + offload to cpu resume by @muellerzr in #2762
- Enable config for fsdp activation checkpointing by @helloworld1 in #2779
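The shard-merging util (#2772) addresses recombining per-rank FSDP checkpoint shards into a single full state dict. Below is a toy sketch of that idea using plain Python lists in place of torch tensors; it is not accelerate's actual utility (consult the accelerate docs for the real function and its signature):

```python
def merge_shards(shards_by_rank):
    """Merge per-rank shard dicts into one full state dict.

    shards_by_rank: list of dicts (ordered by rank), each mapping a
    parameter name to that rank's slice of the flattened parameter.
    Concatenating the slices in rank order reconstructs each full
    parameter.
    """
    merged = {}
    for shard in shards_by_rank:  # iterate ranks in order
        for name, piece in shard.items():
            merged.setdefault(name, []).extend(piece)
    return merged
```

The key invariant is that slices are appended in rank order, so the merged value has the same layout as the original unsharded parameter.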
Megatron
- Upgrade huggingface's megatron to nvidia's megatron when using `MegatronLMPlugin` by @zhangsheng377 in #2501
What's Changed
- Add feature to allow redirecting std streams into log files when using torchrun as the launcher. by @lyuwen in #2740
- Update modeling.py by adding try-catch section to skip the unavailable devices by @MeVeryHandsome in #2681
- Fixed the problem of incorrect conditional judgment statement when configuring enable_cpu_affinity by @statelesshz in #2748
- Fix stacklevel in `logging` to log the actual user call site (instead of the call site inside the logger wrapper) of log functions by @luowyang in #2730
- LOMO / FIX: Support multiple optimizers by @younesbelkada in #2745
- Fix max_memory assignment by @SunMarc in #2751
- Fix duplicate environment variable check in multi-cpu condition by @yhna940 in #2752
- Simplify CLI args validation and ensure CLI args take precedence over config file. by @Iain-S in #2757
- Fix sagemaker config by @muellerzr in #2753
- Fix CPU OMP num threads setting by @jiqing-feng in #2755
- Revert "Simplify CLI args validation and ensure CLI args take precedence over config file." by @muellerzr in #2763
- Enable sharded cpu resume by @muellerzr in #2762
- Sets default to PyTorch defaults based on backend by @muellerzr in #2758
- Optimize `get_module_leaves` speed by @BBuf in #2756
- Fix minor typo by @TemryL in #2767
- Fix small edge case in get_module_leaves by @SunMarc in #2774
- Skip deepspeed test by @SunMarc in #2776
- Enable config for fsdp activation checkpointing by @helloworld1 in #2779
- Add arg from CLI to fix failing test by @muellerzr in #2783
- Skip tied weights disk offload test by @SunMarc in #2782
- Introduce shard-merging util for FSDP by @muellerzr in #2772
- FIX / FSDP : Guard fsdp utils for earlier PyTorch versions by @younesbelkada in #2794
- Upgrade huggingface's megatron to nvidia's megatron when using `MegatronLMPlugin` by @zhangsheng377 in #2501
- Fixup CLI test by @muellerzr in #2796
- Fix duplicate elements in `split_between_processes` by @hkunzhe in #2781
- Add elastic launch support to `notebook_launcher` by @yhna940 in #2788
- Fix wrong use of `sync_gradients` used to implement `sync_each_batch` by @fabianlim in #2790
- Fix type in accelerator.py by @qgallouedec in #2800
- Fix Comet ML test by @SunMarc in #2804
- New template by @muellerzr in #2808
- Fix access error for torch.mps when using torch==1.13.1 on macOS by @SunMarc in #2806
- 4-bit quantization meta device bias loading bug by @SunMarc in #2805
- State dictionary retrieval from offloaded modules by @blbadger in #2619
- Add CUDA dependency for a test by @SunMarc in #2820
- Remove outdated XPU device check code in `get_balanced_memory` by @faaany in #2826
- Fix DeepSpeed config validation error by changing `stage3_prefetch_bucket_size` value to an integer by @adk9 in #2814
- Improve test speeds by up to 30% in multi-GPU settings by @muellerzr in #2830
- monitor-interval, take 2 by @muellerzr in #2833
- Optimize the megatron plugin by @zhangsheng377 in #2822
- Fix f-string format by @Jintao-Huang in #2810
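The `stacklevel` logging fix (#2730) is easy to illustrate with Python's standard library alone: passing `stacklevel=2` through a logger wrapper makes the emitted record point at the wrapper's caller rather than the wrapper itself (supported since Python 3.8). A self-contained sketch, unrelated to accelerate's actual wrapper code:

```python
import logging

logging.basicConfig(format="%(funcName)s: %(message)s")
log = logging.getLogger("demo")
log.setLevel(logging.WARNING)

def warn_deprecated(msg):
    # stacklevel=2 skips this wrapper frame, so the record's funcName
    # and lineno refer to the user call site, not warn_deprecated.
    log.warning(msg, stacklevel=2)

def user_code():
    warn_deprecated("old_api is deprecated")

user_code()  # the logged funcName is "user_code", not "warn_deprecated"
```

Without `stacklevel`, every warning appears to originate from inside the logging wrapper, which is the symptom the fix addresses.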
New Contributors
- @lyuwen made their first contribution in #2740
- @MeVeryHandsome made their first contribution in #2681
- @luowyang made their first contribution in #2730
- @Iain-S made their first contribution in #2757
- @BBuf made their first contribution in #2756
- @TemryL made their first contribution in #2767
- @helloworld1 made their first contribution in #2779
- @hkunzhe made their first contribution in #2781
- @adk9 made their first contribution in #2814
- @Jintao-Huang made their first contribution in #2810
Full Changelog: v0.30.1...v0.31.0