What's Changed
Release
- [release] update version (#5654) by Hongxin Liu
- [release] grok-1 inference benchmark (#5500) by binmakeswell
- [release] grok-1 314b inference (#5490) by binmakeswell
Hotfix
- [hotfix] add soft link to support required files (#5661) by Tong Li
- [hotfix] Fixed fused layernorm bug without apex (#5609) by Edenzzzz
- [hotfix] Fix examples no pad token & auto parallel codegen bug; (#5606) by Edenzzzz
- [hotfix] fix typo s/get_defualt_parser /get_default_parser (#5548) by digger yu
- [hotfix] quick fixes to make legacy tutorials runnable (#5559) by Edenzzzz
- [hotfix] set return_outputs=False in examples and polish code (#5404) by Wenhao Chen
- [hotfix] fix typo s/keywrods/keywords etc. (#5429) by digger yu
News
- [news] llama3 and open-sora v1.1 (#5655) by binmakeswell
Lazyinit
- [lazyinit] skip whisper test (#5653) by Hongxin Liu
Shardformer
- [shardformer] refactor pipeline grad ckpt config (#5646) by Hongxin Liu
- [shardformer] fix chatglm implementation (#5644) by Hongxin Liu
- [shardformer] remove useless code (#5645) by flybird11111
- [shardformer] update transformers (#5583) by Wang Binluo
- [shardformer] fix pipeline grad ckpt (#5620) by Hongxin Liu
- [shardformer] refactor embedding resize (#5603) by flybird11111
- [shardformer] Sequence Parallelism Optimization (#5533) by Zhongkai Zhao (a usage sketch follows this section)
- [shardformer] fix pipeline forward error if custom layer distribution is used (#5189) by Insu Jang
- [shardformer] update colo attention to support custom mask (#5510) by Hongxin Liu
- [shardformer]Fix lm parallel. (#5480) by flybird11111
- [shardformer] fix gathering output when using tensor parallelism (#5431) by flybird11111
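The sequence-parallelism work in #5533 is exposed through the `HybridParallelPlugin`. The sketch below shows how it is typically switched on; the `enable_sequence_parallelism` and `sequence_parallelism_mode` keyword names and the `"split_gather"` mode string are assumptions inferred from the PR, so verify them against the released plugin signature.

```python
import torch
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import HybridParallelPlugin
from transformers import LlamaConfig, LlamaForCausalLM

# Launch with torchrun; the config argument was still expected around this release.
colossalai.launch_from_torch(config={})

# Tiny Llama purely for illustration.
config = LlamaConfig(hidden_size=128, intermediate_size=256,
                     num_hidden_layers=2, num_attention_heads=4,
                     num_key_value_heads=4, vocab_size=1000)
model = LlamaForCausalLM(config)

plugin = HybridParallelPlugin(
    tp_size=2,                                 # tensor-parallel degree (needs 2 GPUs here)
    pp_size=1,                                 # no pipeline parallelism in this sketch
    enable_sequence_parallelism=True,          # switch the optimized sequence parallelism on
    sequence_parallelism_mode="split_gather",  # assumed mode name from #5533
)
booster = Booster(plugin=plugin)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model, optimizer, *_ = booster.boost(model, optimizer)
```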
Fix
- [Fix]: implement thread-safe singleton to avoid deadlock for very large-scale training scenarios (#5625) by Season (see the sketch after this list)
- [fix] fix typo s/muiti-node /multi-node etc. (#5448) by digger yu
- [Fix] Grok-1 use tokenizer from the same pretrained path (#5532) by Yuanheng Zhao
- [fix] fix grok-1 example typo (#5506) by Yuanheng Zhao
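PR #5625 replaces a singleton that could deadlock at very large scale with a thread-safe one. The snippet below is not ColossalAI's implementation, only a generic double-checked-locking sketch of the pattern; `SingletonMeta` and `GlobalRegistry` are hypothetical names.

```python
import threading

class SingletonMeta(type):
    """Metaclass that makes every class using it a process-wide singleton."""
    _instances = {}
    _lock = threading.Lock()

    def __call__(cls, *args, **kwargs):
        # Fast path: skip the lock once the instance exists.
        if cls not in cls._instances:
            with cls._lock:
                # Double-check: another thread may have built the instance
                # while this one was waiting on the lock.
                if cls not in cls._instances:
                    cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class GlobalRegistry(metaclass=SingletonMeta):
    """Hypothetical user of the metaclass."""
    def __init__(self):
        self.items = {}

assert GlobalRegistry() is GlobalRegistry()  # every call returns the same object
```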
Coloattention
- [coloattention]modify coloattention (#5627) by flybird11111
Example
- [example] llama3 (#5631) by binmakeswell
- [example] update Grok-1 inference (#5495) by Yuanheng Zhao
- [example] add grok-1 inference (#5485) by Hongxin Liu
- [example] update llama example (#5626) by Hongxin Liu
Zero
- [zero] support multiple (partial) backward passes (#5596) by Hongxin Liu (a minimal sketch follows)
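PR #5596 lets ZeRO training run more than one backward pass before the optimizer step. Below is a minimal sketch of that usage pattern, assuming a `torchrun` launch; the model, inputs, and losses are placeholders.

```python
import torch
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import LowLevelZeroPlugin

colossalai.launch_from_torch(config={})  # the config argument may be optional in newer releases

plugin = LowLevelZeroPlugin(stage=2)
booster = Booster(plugin=plugin)

model = torch.nn.Linear(16, 16).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
model, optimizer, *_ = booster.boost(model, optimizer)

x = torch.randn(4, 16).cuda()
loss_a = model(x).mean()
loss_b = (model(x) ** 2).mean()

# Two (partial) backward passes accumulate into the same sharded gradients.
booster.backward(loss_a, optimizer)
booster.backward(loss_b, optimizer)
optimizer.step()
optimizer.zero_grad()
```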
Doc
- [doc] fix ColossalMoE readme (#5599) by Camille Zhong
- [doc] update open-sora demo (#5479) by binmakeswell
- [doc] release Open-Sora 1.0 with model weights (#5468) by binmakeswell
Devops
- [devops] remove post commit ci (#5566) by Hongxin Liu
- [devops] fix example test ci (#5504) by Hongxin Liu
- [devops] fix compatibility (#5444) by Hongxin Liu
Shardformer, pipeline
- [shardformer, pipeline] add `gradient_checkpointing_ratio` and heterogeneous shard policy for llama (#5508) by Wenhao Chen (see the sketch below)
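PR #5508 adds a `gradient_checkpointing_ratio` knob so only a fraction of layers are activation-checkpointed under pipeline parallelism (and #5646 later refactors how it is configured). The sketch below assumes the ratio is accepted directly by `HybridParallelPlugin`, which may not match the released interface exactly.

```python
from colossalai.booster.plugin import HybridParallelPlugin

# Assumed keyword, taken from the PR title: checkpoint roughly half of the
# transformer layers instead of all of them to trade memory for speed.
plugin = HybridParallelPlugin(
    tp_size=1,
    pp_size=2,                         # the ratio targets pipeline-parallel training
    gradient_checkpointing_ratio=0.5,  # hypothetical kwarg; see #5508 / #5646 for the final API
)
```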
Format
- [format] applied code formatting on changed files in pull request 5510 (#5517) by github-actions[bot]
Full Changelog: v0.3.6...v0.3.7