Megatron LM integration
Accelerate now supports Megatron-LM for the three model classes (BERT, GPT-2 and T5). You can learn more in the documentation.
- Megatron-LM integration by @pacman100 in #667
- ensure megatron is 2.2.0+ by @jeffra in #755
- updating docs to use fork of megatron-lm and minor example/docs fix by @pacman100 in #766
- adding support to return logits and generate for Megatron-LM GPT models by @pacman100 in #819
PyTorch 1.13 support
Fixes a bug that returned SIGKILL errors on Windows.
- Isolate distrib_run by @muellerzr in #828
Kaggle support with the notebook_launcher
With Kaggle now providing instances with two T4 GPUs, Accelerate can use both for multi-GPU training directly from a notebook.
- Work in kaggle! by @muellerzr in #783
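As a rough illustration of the workflow described above, the following minimal sketch launches a function on two processes from a notebook cell. The names `training_function` and `launch_on_two_gpus` are hypothetical placeholders, and the imports are deferred into the functions only so that the snippet can be defined without initializing anything early. It assumes a notebook session with two GPUs attached and does no real training.

```python
def training_function(learning_rate: float = 1e-3) -> None:
    """Hypothetical stand-in for a real training loop."""
    # Deferred import: the Accelerator is created inside each spawned process.
    from accelerate import Accelerator

    accelerator = Accelerator()
    # accelerator.print only prints on the main process.
    accelerator.print(f"Running on {accelerator.num_processes} process(es)")
    # Each process reports its own index and device.
    print(f"process {accelerator.process_index} -> {accelerator.device} (lr={learning_rate})")


def launch_on_two_gpus() -> None:
    """Start training_function on two processes, one per GPU."""
    from accelerate import notebook_launcher

    # args is passed positionally to training_function;
    # num_processes=2 assumes the two-T4 instance mentioned above.
    notebook_launcher(training_function, args=(1e-3,), num_processes=2)
```

In a notebook cell, calling `launch_on_two_gpus()` would start the run. It is usually safest to avoid touching CUDA in earlier cells before launching.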
What's new?
- Add non_blocking kwarg to send_to_device() by @NouamaneTazi in #607
- [ds launcher] un-hijack PYTHONPATH by @stas00 in #741
- Fix num_processes is not defined by @muellerzr in #746
- [Device map] nn.Parameter don't have children by @patrickvonplaten in #747
- Use HTML relative paths for tiles by @lewtun in #749
- Add gpu_ids to SageMakerConfig though it should never be set by @muellerzr in #751
- Change num_cpu_threads_per_process default by @muellerzr in #753
- Return unclipped gradient from grad_clip_norm_ by @samuelstevens in #756
- refactor by @pacman100 in #758
- update docs by @pacman100 in #759
- Only wrap modules in DDP if they require grad by @samuelstevens in #761
- Move io_same_device hook to before attach_align_device hook on cpu_offload and disk_offload. by @piEsposito in #768
- Regression cli tests by @muellerzr in #772
- Fix number of devices in get_balanced_memory by @sgugger in #774
- Fix all github actions issues + deprecations by @muellerzr in #773
- Fix flakey wandb test by @muellerzr in #775
- Add defaults for launchers by @muellerzr in #778
- Allow BatchSamplerShard to not even out batches by @sgugger in #776
- Make rich toggleable and separate out a new environment utility file by @muellerzr in #779
- Add same_network + docs by @muellerzr in #780
- fix transformers tests by @ArthurZucker in #777
- Add Dev Container configuration by @Chris-hughes10 in #782
- separate dataloader generator from sampler generator by @pacman100 in #789
- Consider top-level buffers when computing infer_auto_device_map by @younesbelkada in #792
- Add even_batches keyword to Accelerator by @Chris-hughes10 in #781
- Fix device_map="auto" on CPU-only envs by @sgugger in #797
- Fix extraction of state dict in offload by @sgugger in #795
- fix: add pdsh as default launcher by @zanussbaum in #800
- Deal with optimizer.differentiable in PyTorch 1.13.0 by @comaniac in #803
- Introduce a pod-config command by @muellerzr in #802
- Refactor CLI to improve readability by @muellerzr in #810
- adding support to pickle and unpickle AcceleratedOptimizer by @pacman100 in #811
- add recurse argument in remove_hook_from_module by @younesbelkada in #812
- Act on deprecations by @muellerzr in #813
- Mlflow-tracker-v2 🔥 by @nbroad1881 in #794
- Update CLI docs and use mps rather than mps_device by @muellerzr in #814
- Rename pod-config to tpu-config + docs by @muellerzr in #818
- Update docs by @muellerzr in #823
- rename sklearn to proper dep by @muellerzr in #825
- Rename by @muellerzr in #824
- Update pr docs actions by @mishig25 in #827
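Two entries in the list above (#776 and #781) concern whether batches are evened out across processes. The pure-Python sketch below is only a conceptual illustration of that idea, not the library's implementation. The function `shard_batches` and its parameters are hypothetical, and it assumes that evening out means looping back to the start of the dataset so every process receives the same number of full batches.

```python
from itertools import cycle
from typing import List


def shard_batches(
    num_samples: int,
    batch_size: int,
    num_processes: int,
    process_index: int,
    even_batches: bool = True,
) -> List[List[int]]:
    """Return the batches of sample indices seen by one process (illustrative only)."""
    indices = list(range(num_samples))
    batches = [indices[i : i + batch_size] for i in range(0, num_samples, batch_size)]

    if even_batches and batches:
        recycled = cycle(indices)  # loop back to the start of the dataset
        # Fill the last batch up to the full batch size.
        while len(batches[-1]) < batch_size:
            batches[-1].append(next(recycled))
        # Add whole batches until every process gets the same count.
        while len(batches) % num_processes != 0:
            batches.append([next(recycled) for _ in range(batch_size)])

    # Process i takes batches i, i + num_processes, i + 2 * num_processes, ...
    return batches[process_index::num_processes]


even_rank0 = shard_batches(10, 4, 2, 0, even_batches=True)
even_rank1 = shard_batches(10, 4, 2, 1, even_batches=True)
uneven_rank0 = shard_batches(10, 4, 2, 0, even_batches=False)
uneven_rank1 = shard_batches(10, 4, 2, 1, even_batches=False)
```

With 10 samples, a batch size of 4 and two processes, the evened-out variant gives each process two full batches by reusing early samples. The uneven variant leaves the first process with a short final batch and the second process with a single batch.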
Significant community contributions
The following contributors have made significant changes to the library over the last release: