hpcaitech/ColossalAI v0.4.3
Version v0.4.3 Release Today!

What's Changed

Release

Fp8

Hotfix

Colossalai/checkpoint_io/...

  • [colossalai/checkpoint_io/...] fix bug in load_state_dict_into_model; format error msg (#6020) by Gao, Ruiyuan

Colossal-llama

Plugin

Ci

  • [CI] Remove triton version for compatibility bug; update req torch >=2.2 (#6018) by Wenxuan Tan
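
The CI change above bumps the torch requirement to >= 2.2. Below is a minimal, purely illustrative guard for your own scripts; the version floor comes from the bullet above, while the check itself is not part of ColossalAI:

```python
import torch
from packaging.version import parse

# ColossalAI v0.4.3 requires torch >= 2.2 (see the CI item above).
# This guard is illustrative only, not part of the library.
MIN_TORCH = "2.2.0"
if parse(torch.__version__) < parse(MIN_TORCH):
    raise RuntimeError(
        f"found torch {torch.__version__}, but ColossalAI v0.4.3 expects >= {MIN_TORCH}"
    )
```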

Pre-commit.ci

Colossalchat

Misc

  • [misc] Use dist logger in plugins (#6011) by Edenzzzz (see the sketch after this list)
  • [misc] update compatibility (#6008) by Hongxin Liu
  • [misc] Bypass the huggingface bug to solve the mask mismatch problem (#5991) by Haze188
  • [misc] remove useless condition by haze188
  • [misc] fix ci failure: change default value to false in moe plugin by haze188
  • [misc] remove incompatible test config by haze188
  • [misc] remove debug/print code by haze188
  • [misc] skip redundant test by haze188
  • [misc] solve booster hang by rename the variable by haze188
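
For the "Use dist logger in plugins" change, here is a minimal sketch of the distributed logger the plugins now route messages through, assuming the public get_dist_logger helper; the messages and rank filter are illustrative:

```python
from colossalai.logging import get_dist_logger

# get_dist_logger() returns a per-process logger; the `ranks` argument limits
# which ranks emit a record, so plugin messages print once instead of once per rank.
logger = get_dist_logger()
logger.info("building plugin", ranks=[0])        # only rank 0 prints
logger.warning("gradient clipping is disabled")  # every rank prints
```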

Feature

Chat

Test ci

Docs

  • [Docs] clarify launch port by Edenzzzz
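
On the launch-port clarification: a minimal sketch, assuming the current launch_from_torch signature (no config argument). The rendezvous port is decided by the launcher environment (MASTER_PORT, e.g. set by torchrun or `colossalai run --master_port`), not by a code-level argument:

```python
import colossalai

# Run this under torchrun or `colossalai run`; RANK, WORLD_SIZE, MASTER_ADDR and
# MASTER_PORT are read from the environment, so the port is fixed at launch time,
# e.g. `colossalai run --nproc_per_node 4 --master_port 29500 train.py`.
colossalai.launch_from_torch(seed=42)
```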

Test

  • [test] add zero fp8 test case by ver217 (see the sketch after this list)
  • [test] add check by hxwang
  • [test] fix test: test_zero1_2 by hxwang
  • [test] add mixtral modelling test by botbw
  • [test] pass mixtral shardformer test by botbw
  • [test] mixtral pp shard test by hxwang
  • [test] add mixtral transformer test by hxwang
  • [test] add mixtral for sequence classification by hxwang
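
The "add zero fp8 test case" item exercises FP8 communication on the ZeRO path. Below is a rough sketch of such a setup, run under torchrun on GPUs; the fp8_communication flag name is an assumption based on the FP8 work in this release series and may differ in your installed version:

```python
import torch
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import LowLevelZeroPlugin

colossalai.launch_from_torch(seed=42)  # requires torchrun / `colossalai run`

# `fp8_communication=True` is an assumed flag mirroring the fp8 + zero test above;
# check the LowLevelZeroPlugin signature of your installed version.
plugin = LowLevelZeroPlugin(stage=2, precision="bf16", fp8_communication=True)
booster = Booster(plugin=plugin)

model = torch.nn.Linear(32, 32).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
model, optimizer, *_ = booster.boost(model, optimizer)

loss = model(torch.randn(4, 32, device="cuda")).mean()
booster.backward(loss, optimizer)
optimizer.step()
```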

Lora

Feat

Chore

  • [chore] remove redundant test case, print string & reduce test tokens by botbw
  • [chore] docstring by hxwang
  • [chore] change moe_pg_mesh to private by hxwang
  • [chore] solve moe ckpt test failure and some other arg pass failure by hxwang
  • [chore] minor fix after rebase by hxwang
  • [chore] minor fix by hxwang
  • [chore] arg pass & remove drop token by hxwang
  • [chore] trivial fix by botbw
  • [chore] manually revert unintended commit by botbw
  • [chore] handle non member group by hxwang

Moe

  • [moe] solve dp axis issue by botbw
  • [moe] remove force_overlap_comm flag and add warning instead by hxwang
  • Revert "[moe] implement submesh initialization" by hxwang
  • [moe] refactor mesh assignment by hxwang
  • [moe] deepseek moe sp support by haze188
  • [moe] remove ops by hxwang
  • [moe] full test for deepseek and mixtral (pp + sp to fix) by hxwang
  • [moe] finalize test (no pp) by hxwang
  • [moe] init moe plugin comm setting with sp by hxwang
  • [moe] clean legacy code by hxwang
  • [moe] test deepseek by hxwang
  • [moe] implement tp by botbw
  • [moe] add mixtral dp grad scaling when not all experts are activated by botbw
  • [moe] implement submesh initialization by botbw
  • [moe] implement transit between non moe tp and ep by botbw
  • [moe] fix plugin by hxwang

Doc

  • [doc] add MoeHybridParallelPlugin docstring by botbw
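
Since the MoeHybridParallelPlugin docstring is the user-facing entry point for the MoE work above, here is a minimal construction sketch assuming the usual booster-plugin interface; the parallel sizes are illustrative and must divide the launched world size:

```python
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import MoeHybridParallelPlugin

colossalai.launch_from_torch(seed=42)  # run under torchrun / `colossalai run`

# Illustrative layout: experts are sharded over ep_size ranks, with ZeRO-1 for
# the dense parameters; tp/pp are left at 1 to keep the example small.
plugin = MoeHybridParallelPlugin(
    tp_size=1,
    pp_size=1,
    ep_size=2,
    zero_stage=1,
    precision="bf16",
)
booster = Booster(plugin=plugin)
# model, optimizer, ... = booster.boost(model, optimizer, ...) as with other plugins
```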

Deepseek

  • [deepseek] replace attn (a workaround for bug in transformers) by hxwang

Bug

  • [bug] fix: somehow logger hangs the program by botbw

Zero

  • [zero] solve hang by botbw
  • [zero] solve hang by hxwang

Full Changelog: v0.4.2...v0.4.3
