What's Changed
Release
- [release] update version (#6041) by Hongxin Liu
FP8
- [fp8] disable all_to_all_fp8 in intranode (#6045) by Hanks
- [fp8] fix linear hook (#6046) by Hongxin Liu
- [fp8] optimize all-gather (#6043) by Hongxin Liu
- [FP8] unsqueeze scale to make it compatible with torch.compile (#6040) by Guangyao Zhang
- Merge pull request #6012 from hpcaitech/feature/fp8_comm by Hongxin Liu
- Merge pull request #6033 from wangbluo/fix by Wang Binluo
- Merge pull request #6024 from wangbluo/fix_merge by Wang Binluo
- Merge pull request #6023 from wangbluo/fp8_merge by Wang Binluo
- [fp8] Merge feature/fp8_comm to main branch of Colossalai (#6016) by Wang Binluo
- [fp8] zero support fp8 linear. (#6006) by flybird11111
- [fp8] add use_fp8 option for MoeHybridParallelPlugin (#6009) by Wang Binluo
- [fp8] update reduce-scatter test (#6002) by flybird11111
- [fp8] linear perf enhancement by botbw
- [fp8] update torch.compile for linear_fp8 to >= 2.4.0 (#6004) by botbw
- [fp8] support asynchronous FP8 communication (#5997) by flybird11111
- [fp8] refactor fp8 linear with compile (#5993) by Hongxin Liu
- [fp8] support hybrid parallel plugin (#5982) by Wang Binluo (see the sketch after this list)
- [fp8] MoE support fp8 communication (#5977) by flybird11111
- [fp8] use torch compile (torch >= 2.3.0) (#5979) by botbw
- [fp8] support gemini plugin (#5978) by Hongxin Liu
- [fp8] support fp8 amp for hybrid parallel plugin (#5975) by Hongxin Liu
- [fp8] add fp8 linear (#5967) by Hongxin Liu
- [fp8] support all2all fp8 (#5953) by flybird11111
- [FP8] rebase main (#5963) by flybird11111
- Merge pull request #5961 from ver217/feature/zeor-fp8 by Hanks
- [fp8] add fp8 comm for low level zero by ver217
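Taken together, these PRs add FP8 mixed-precision compute and FP8 collective communication to the hybrid parallel, MoE, Gemini, and zero plugins. Below is a minimal sketch of how a user might opt in; the flag names `use_fp8` and `fp8_communication` are assumptions inferred from the PR titles (#5975, #5982, #6009), not a confirmed API.

```python
# Hypothetical sketch: enabling FP8 compute and FP8 collective communication.
# The flag names below are inferred from the PR titles and may differ from
# the released API.
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import HybridParallelPlugin

colossalai.launch_from_torch()  # set up the distributed environment (run via torchrun)

plugin = HybridParallelPlugin(
    tp_size=2,
    pp_size=1,
    precision="bf16",
    use_fp8=True,            # assumed flag: FP8 linear layers (torch.compile-backed)
    fp8_communication=True,  # assumed flag: FP8 all-gather / all-to-all / reduce-scatter
)
booster = Booster(plugin=plugin)
# model, optimizer, criterion, and dataloader are then wrapped via booster.boost(...)
```

Both flags are sketched as opt-in; FP8 communication mainly saves inter-node traffic, which is consistent with #6045 disabling the FP8 all-to-all path for intra-node links.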
Hotfix
- [Hotfix] Remove deprecated install (#6042) by Tong Li
- [Hotfix] Fix llama fwd replacement bug (#6031) by Wenxuan Tan
- [Hotfix] Avoid fused RMSnorm import error without apex (#5985) by Edenzzzz
- [Hotfix] README link (#5966) by Tong Li
- [hotfix] Remove unused plan section (#5957) by Tong Li
Colossalai/checkpoint_io/...
- [colossalai/checkpoint_io/...] fix bug in load_state_dict_into_model; format error msg (#6020) by Gao, Ruiyuan
Plugin
- [plugin] hotfix zero plugin (#6036) by Hongxin Liu
- [plugin] add cast inputs option for zero (#6003) (#6022) by Hongxin Liu
- [plugin] add cast inputs option for zero (#6003) by Hongxin Liu
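PR #6003 (re-landed in #6022) exposes a cast-inputs switch on the zero plugin so callers can keep batch tensors in their original dtype instead of having them cast to the plugin's mixed precision. A minimal sketch, assuming the option is named `cast_inputs` on `LowLevelZeroPlugin`:

```python
# Hypothetical sketch: disabling automatic input casting under the zero plugin.
# `cast_inputs` is the option named in PR #6003; its exact placement and
# default value are assumptions here.
from colossalai.booster import Booster
from colossalai.booster.plugin import LowLevelZeroPlugin

plugin = LowLevelZeroPlugin(
    stage=2,
    precision="fp16",
    cast_inputs=False,  # assumed: leave batch tensors in their original dtype
)
booster = Booster(plugin=plugin)
```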
CI
- [CI] Remove triton version for compatibility bug; update req torch >=2.2 (#6018) by Wenxuan Tan
Pre-commit.ci
- [pre-commit.ci] auto fixes from pre-commit.com hooks by pre-commit-ci[bot]
- [pre-commit.ci] pre-commit autoupdate (#5995) by pre-commit-ci[bot]
Misc
- [misc] Use dist logger in plugins (#6011) by Edenzzzz
- [misc] update compatibility (#6008) by Hongxin Liu
- [misc] Bypass the huggingface bug to solve the mask mismatch problem (#5991) by Haze188
- [misc] remove useless condition by haze188
- [misc] fix ci failure: change default value to false in moe plugin by haze188
- [misc] remove incompatible test config by haze188
- [misc] remove debug/print code by haze188
- [misc] skip redundant test by haze188
- [misc] solve booster hang by renaming the variable by haze188
Feature
- [Feature] Zigzag Ring attention (#5905) by Edenzzzz
- [Feature]: support FP8 communication in DDP, FSDP, Gemini (#5928) by Hanks
- [Feature] llama shardformer fp8 support (#5938) by Guangyao Zhang
- [Feature] MoE Ulysses Support (#5918) by Haze188
Chat
- [Chat] fix readme (#5989) by YeAnbang
- Merge pull request #5962 from hpcaitech/colossalchat by YeAnbang
- [Chat] Fix lora (#5946) by YeAnbang
Test CI
- [test ci] Feature/fp8 comm (#5981) by flybird11111
Docs
- [Docs] clarify launch port by Edenzzzz
Test
- [test] add zero fp8 test case by ver217
- [test] add check by hxwang
- [test] fix test: test_zero1_2 by hxwang
- [test] add mixtral modelling test by botbw
- [test] pass mixtral shardformer test by botbw
- [test] mixtral pp shard test by hxwang
- [test] add mixtral transformer test by hxwang
- [test] add mixtral for sequence classification by hxwang
Lora
- [lora] lora support hybrid parallel plugin (#5956) by Wang Binluo
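#5956 extends LoRA support to the hybrid parallel plugin. A minimal sketch, assuming the existing `booster.enable_lora` entry point and a peft-style `LoraConfig`; names and signatures should be checked against the release docs:

```python
# Hypothetical sketch: LoRA fine-tuning through the booster API with the
# hybrid parallel plugin. booster.enable_lora and the peft LoraConfig are
# assumed entry points, not a confirmed recipe from this release.
from peft import LoraConfig
from transformers import AutoModelForCausalLM

from colossalai.booster import Booster
from colossalai.booster.plugin import HybridParallelPlugin

plugin = HybridParallelPlugin(tp_size=2, pp_size=1, precision="bf16")
booster = Booster(plugin=plugin)

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = booster.enable_lora(model, lora_config=lora_config)
# model, optimizer, and dataloader are then wrapped via booster.boost(...)
```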
Chore
- [chore] remove redundant test case, print string & reduce test tokens by botbw
- [chore] docstring by hxwang
- [chore] change moe_pg_mesh to private by hxwang
- [chore] solve moe ckpt test failure and some other arg pass failure by hxwang
- [chore] minor fix after rebase by hxwang
- [chore] minor fix by hxwang
- [chore] arg pass & remove drop token by hxwang
- [chore] trivial fix by botbw
- [chore] manually revert unintended commit by botbw
- [chore] handle non member group by hxwang
MoE
- [moe] solve dp axis issue by botbw
- [moe] remove force_overlap_comm flag and add warning instead by hxwang
- Revert "[moe] implement submesh initialization" by hxwang
- [moe] refactor mesh assignment by hxwang
- [moe] deepseek moe sp support by haze188
- [moe] remove ops by hxwang
- [moe] full test for deepseek and mixtral (pp + sp to fix) by hxwang
- [moe] finalize test (no pp) by hxwang
- [moe] init moe plugin comm setting with sp by hxwang
- [moe] clean legacy code by hxwang
- [moe] test deepseek by hxwang
- [moe] implement tp by botbw
- [moe] add mixtral dp grad scaling when not all experts are activated by botbw
- [moe] implement submesh initialization by botbw
- [moe] implement transition between non-MoE TP and EP by botbw
- [moe] fix plugin by hxwang
Doc
- [doc] add MoeHybridParallelPlugin docstring by botbw
Deepseek
- [deepseek] replace attn (a workaround for a bug in transformers) by hxwang
Bug
- [bug] fix: somehow logger hangs the program by botbw
Full Changelog: v0.4.2...v0.4.3