deepspeed 0.8.1 on Python PyPI

What's Changed

CUDA optional deepspeed ops by @tjruwase in #2507
Remove CI trigger for push to master by @mrwyattii in #2712
[install] only add deepspeed pkg at install by @jeffra in #2714
Fix nightly tests for new lm-eval release by @mrwyattii in #2713
BF16 optimizer for BF16+ZeRO Stage 1 by @jomayeri in #2706
Fix typo in diffusers transformer block by @mrwyattii in #2718
Inference Refactor (replace_with_policy, model_implementations) by @awan-10 in #2554
Change zero_grad() argument to match pytorch by @loadams in #2741
Automatic tensor parallelism v2 by @molly-smith in #2670
Fixing Optimizer Sanity Check by @jomayeri in #2742
[GatheredParameters] fix memory leak by @stas00 in #2665
Abstract accelerator (step 3) by @delock in #2677
Fix autotuning so that it records Floating Point Operations per second, not microsecond by @dashstander in #2711
fix a misspelled attribute by @stas00 in #2750
[zero] remove misleading dtype log by @jeffra in #2732
Fix softmax backward by @RezaYazdaniAminabadi in #2709
Skip test_bias_gelu unit test if torch < 1.12 by @lekurile in #2754
Conditionally Make Op Building More Verbose by @cmikeh2 in #2759
Bing/formatting correction by @xiexbing in #2764
Add links to new azureML examples by @cassieesvelt in #2756
Fix hardcoded instances to fp16 in optimizer creation log messages to the correct dtype. by @loadams in #2743
Refactor/Pydantify monitoring config by @mrwyattii in #2640
Pin minimum packaging requirement by @carmocca in #2771
Fix for diffusers v0.12.0 by @mrwyattii in #2753
some fix in flops_profiler by @lucasleesw in #2068
fix upsample flops compute by skipping unused kargs by @cli99 in #2773
Fix broken kernel inject bug by @molly-smith in #2776
Fix Checkpoint-loading with Meta-tensor by @RezaYazdaniAminabadi in #2781
Add hjson support for user configs by @mrwyattii in #2783
Reset KV-cache at the beginning of text-generation by @RezaYazdaniAminabadi in #2669
Container param cleanup + remove qkv_merging by @lekurile in #2780
Common location to install libaio-dev by @tjruwase in #2779
Fixing broken link to azureml-examples recipes by @rtanase in #2795
remove outdated comment by @stas00 in #2786
Enable page-locked tensors without CUDA by @tjruwase in #2775
Add container load checkpoint error reporting + refactor by @lekurile in #2792
Add user defined launcher args for PDSH launcher by @loadams in #2804
Fix Slurm launcher user args by @loadams in #2806
Handle hanged tests in CI by @mrwyattii in #2808
Fix inference CI device error by @mrwyattii in #2824
Fix permissions issue with pip upgrade by @mrwyattii in #2823
Fix cpu-only CI hangs by @mrwyattii in #2825
Fix Pipeline Parallel resize unit test by @mrwyattii in #2833
Fix auto TP for duplicate modules with different gems by @molly-smith in #2784
Refactor DS inference API. No longer need replace_method. by @awan-10 in #2831
Port Reza's INT8-quantization fix to container architecture by @lekurile in #2725
Fix gpt-Neox rotary embedding implementation by @RezaYazdaniAminabadi in #2782
Fix for CI failure on system upgrade by @mrwyattii in #2849

New Contributors

@loadams made their first contribution in #2741
@xiexbing made their first contribution in #2764
@carmocca made their first contribution in #2771
@lucasleesw made their first contribution in #2068
@rtanase made their first contribution in #2795

Full Changelog: v0.8.0...v0.8.1

deepspeed 0.8.1 v0.8.1: Patch release on Python PyPI
