What's Changed
- DeepSpeed inference config. (#2459) by @awan-10 in #2472
- Update docs to autogenerate pydantic config model docs by @mrwyattii in #2509
- Add max_tokens alias to max_out_tokens arg to maintain backwards compatibility by @lekurile in #2508
- Deepspeed quantization library v0.1 by @lokoppakmsft in #2450
- Fix backward compatibility for InferenceConfig by @mrwyattii in #2516
- Add missing Inference sub-configs by @mrwyattii in #2518
- Add note about nvcc/hipcc requirement by @jeffra in #2519
- Update codeowners by @jeffra in #2525
- Dequantization Utils Library by @cmikeh2 in #2521
- Fixes for torch 1.14 due to new torch.numel return type by @jeffra in #2522
- Ensure MOE is initialized for SD by @cmikeh2 in #2534
- Make DS-Inference config readable from JSON by @mrwyattii in #2537
- Add MII tests by @mrwyattii in #2533
- Remove mutable default parameter in init_inference() by @aphedges in #2540
- Change Where DS/Triton is Used in Stable Diffusion by @cmikeh2 in #2536
- Pass down the new DS inference config to replace_transformer_layer. by @awan-10 in #2539
- Adding Gradient Accumulation Data Type Config by @jomayeri in #2512
- Report progress at gradient accumulation boundary by @ShijieZZZZ in #2553
- Encode ds config into a command line argument when launching child processes in autotuning by @cli99 in #2524
- Add missing MoE fields to inference config for backward compatibility by @mrwyattii in #2556
- Abstract accelerator (step 1) by @delock in #2504
- Fix invalid check of recorded parameter orders in zero stage3. by @inkcherry in #2550
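Several entries above concern the new pydantic-based inference config (#2459, #2516, #2518, #2537) and the `max_tokens` alias for `max_out_tokens` (#2508). A minimal sketch of loading such a config from JSON, using illustrative field names that are assumptions here rather than the authoritative schema:

```python
import json

# Sketch of a JSON-readable DS-Inference config (#2537).
# Field names are illustrative assumptions; "max_tokens" is the
# backwards-compatible alias for "max_out_tokens" noted in #2508.
config_text = """
{
  "dtype": "fp16",
  "tensor_parallel": {"tp_size": 1},
  "max_tokens": 1024
}
"""

# Parse the JSON text into a plain dict before handing it to DeepSpeed.
config = json.loads(config_text)
print(config["max_tokens"])
print(config["tensor_parallel"]["tp_size"])
```

A dict like this could then be passed as the `config` argument to `deepspeed.init_inference(model, config=...)`; see the autogenerated pydantic config docs (#2509) for the actual field names and types.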
New Contributors
- @ShijieZZZZ made their first contribution in #2553
- @delock made their first contribution in #2504
- @inkcherry made their first contribution in #2550
Full Changelog: v0.7.5...v0.7.6