What's Changed
Features
- [MOE] changed ParallelMode to dist process group by @1SAA in #460
- [MOE] redirect moe_env from global_variables to core by @1SAA in #467
- [zero] zero init ctx receives a dp process group by @ver217 in #471
- [zero] ZeRO supports pipeline parallel by @ver217 in #477
- add LinearGate for MOE in NaiveAMP context by @1SAA in #480
- [zero] polish sharded param name by @feifeibear in #484
- [zero] sharded optim support hybrid cpu adam by @ver217 in #486
- [zero] polish sharded optimizer v2 by @ver217 in #490
- [MOE] support PR-MOE by @1SAA in #488
- [zero] sharded model manages ophooks individually by @ver217 in #492
- [MOE] remove old MoE legacy code by @1SAA in #493
- [zero] sharded model support the reuse of fp16 shard by @ver217 in #495
- [polish] polish singleton and global context by @feifeibear in #500
- [memory] add model data tensor moving api by @feifeibear in #503
- [memory] set cuda mem frac by @feifeibear in #506
- [zero] use colo model data api in sharded optimv2 by @feifeibear in #511
- [MOE] add MOEGPT model by @1SAA in #510
- [zero] zero init ctx enable rm_torch_payload_on_the_fly by @ver217 in #512
- [zero] show model data cuda memory usage after zero context init by @feifeibear in #515
- [log] polish disable_existing_loggers by @ver217 in #519
- [zero] add model data tensor inline moving API by @feifeibear in #521
- [cuda] modify the fused adam, support hybrid of fp16 and fp32 by @Gy-Lu in #497
- [zero] refactor model data tracing by @feifeibear in #522
- [zero] added hybrid adam, removed loss scale in adam by @Gy-Lu in #527 (usage sketched below)
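Several of the ZeRO entries above (#486, #497, #527) revolve around the new hybrid Adam optimizer, which splits the Adam update between CPU and GPU and accepts a mix of fp16 and fp32 parameters. The snippet below is a minimal sketch of how it might be used; the `colossalai.nn.optimizer.HybridAdam` import path and constructor arguments are assumptions based on the public Colossal-AI API, not details confirmed by these notes.

```python
import torch
from colossalai.nn.optimizer import HybridAdam  # assumed import path

# Toy model; HybridAdam keeps part of the optimizer work on CPU and part on GPU,
# handling a hybrid of fp16 and fp32 parameter groups (see #497, #527).
model = torch.nn.Linear(16, 16).cuda()
optimizer = HybridAdam(model.parameters(), lr=1e-3)  # usual Adam hyperparameters assumed

loss = model(torch.randn(4, 16, device="cuda")).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```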
Bug Fixes
- fix discussion button in issue template by @binmakeswell in #504
- [zero] fix grad offload by @feifeibear in #528
Unit Testing
- [MOE] add unit tests for MOE expert layout, gradient handler and kernel by @1SAA in #469
- [test] added rerun on exception for testing by @FrankLeeeee in #475 (see the sketch after this list)
- [zero] fix init device bug in zero init context unittest by @feifeibear in #516
- [test] fixed rerun_on_exception and adapted test cases by @FrankLeeeee in #487
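The `rerun_on_exception` utility referenced in #475 and #487 is a test decorator that retries a test when a matching exception is raised, for example a port clash while spawning distributed workers. The snippet below is a hedged sketch; the import path and keyword arguments are assumptions about the `colossalai.testing` API rather than details stated in these notes.

```python
from colossalai.testing import rerun_on_exception  # assumed location

# Assumed decorator arguments: rerun the test up to max_try times whenever the
# given exception type is raised with a message matching the pattern.
@rerun_on_exception(exception_type=RuntimeError,
                    pattern=".*Address already in use.*",
                    max_try=5)
def test_sharded_model():
    ...  # spawn distributed workers; retried on a port clash instead of failing CI
```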
CI/CD
- [devops] remove tsinghua source for pip by @FrankLeeeee in #505
- [devops] remove tsinghua source for pip by @FrankLeeeee in #507
- [devops] recover tsinghua pip source due to proxy issue by @FrankLeeeee in #509
Documentation
- [doc] update rst by @ver217 in #470
- Update experiment results for Colossal-AI with ZeRO by @Sze-qq in #479
- [doc] docs get correct release version by @ver217 in #489
- Update README.md by @fastalgo in #514
- [doc] update apidoc by @ver217 in #530
Model Zoo
- [model zoo] fix attn mask shape of gpt by @ver217 in #472
- [model zoo] gpt embedding remove attn mask by @ver217 in #474
Miscellaneous
- [install] run without rich by @feifeibear in #513
- [refactor] remove old zero code by @feifeibear in #517
- [format] polish name format for MOE by @feifeibear in #481
New Contributors
Full Changelog: v0.1.0...v0.1.1