What's Changed
Release
- [release] update version (#4995) by Hongxin Liu
Pipeline inference
- [Pipeline Inference] Merge pp with tp (#4993) by Bin Jia
- [Pipeline inference] Combine kvcache with pipeline inference (#4938) by Bin Jia
- [Pipeline Inference] Sync pipeline inference branch to main (#4820) by Bin Jia
Doc
- [doc] add supported feature diagram for hybrid parallel plugin (#4996) by ppt0011
- [doc] Update doc for colossal-inference (#4989) by Cuiqing Li (李崔卿)
- Merge pull request #4889 from ppt0011/main by ppt0011
- [doc] add reminder for issue encountered with hybrid adam by ppt0011
- [doc] update advanced tutorials, training gpt with hybrid parallelism (#4866) by flybird11111
- Merge pull request #4858 from Shawlleyw/main by ppt0011
- [doc] update slack link (#4823) by binmakeswell
- [doc] add lazy init docs (#4808) by Hongxin Liu
- Merge pull request #4805 from TongLi3701/docs/fix by Desperado-Jia
- [doc] polish shardformer doc (#4779) by Baizhou Zhang
- [doc] add llama2 domain-specific solution news (#4789) by binmakeswell
Hotfix
- [hotfix] fix the bug of repeatedly storing param group (#4951) by Baizhou Zhang
- [hotfix] Fix the bug where process groups were not being properly released. (#4940) by littsk
- [hotfix] fix torch 2.0 compatibility (#4936) by Hongxin Liu
- [hotfix] fix lr scheduler bug in torch 2.0 (#4864) by Baizhou Zhang
- [hotfix] fix bug in sequence parallel test (#4887) by littsk
- [hotfix] Correct several erroneous code comments (#4794) by littsk
- [hotfix] fix norm type error in zero optimizer (#4795) by littsk
- [hotfix] change llama2 Colossal-LLaMA-2 script filename (#4800) by Chandler-Bing
Kernels
- [Kernels] Update Triton kernels to 2.1.0 and add flash-decoding for llama token attention (#4965) by Cuiqing Li
Inference
- [Inference] Dynamic Batching Inference, online and offline (#4953) by Jianghai
- [Inference] Add Chatglm2 bench script (#4963) by Jianghai
- [inference] add reference and fix some bugs (#4937) by Xu Kai
- [inference] Add smoothquant for llama (#4904) by Xu Kai
- [inference] add llama2 support (#4898) by Xu Kai
- [inference] fix import bug and delete useless init (#4830) by Jianghai
Test
- [test] merge old component tests into model zoo (#4945) by Hongxin Liu
- [test] add no master test for low level zero plugin (#4934) by Zhongkai Zhao
- Merge pull request #4856 from KKZ20/test/model_support_for_low_level_zero by ppt0011
- [test] modify model supporting part of low_level_zero plugin (including corresponding docs) by Zhongkai Zhao
Refactor
- [Refactor] Integrated some lightllm kernels into token-attention (#4946) by Cuiqing Li
Nfc
- [nfc] fix some typo with colossalai/ docs/ etc. (#4920) by digger yu
- [nfc] fix minor typo in README (#4846) by Blagoy Simandoff
- [NFC] polish code style (#4799) by Camille Zhong
- [NFC] polish colossalai/inference/quant/gptq/cai_gptq/__init__.py code style (#4792) by Michelle
Format
- [format] applied code formatting on changed files in pull request 4820 (#4886) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 4908 (#4918) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 4595 (#4602) by github-actions[bot]
Gemini
- [gemini] support gradient accumulation (#4869) by Baizhou Zhang
- [gemini] support amp o3 for gemini (#4872) by Hongxin Liu
Kernel
- [kernel] support pure fp16 for cpu adam and update gemini optim tests (#4921) by Hongxin Liu
Feature
- [feature] support no master weights option for low level zero plugin (#4816) by Zhongkai Zhao
- [feature] Add clip_grad_norm for hybrid_parallel_plugin (#4837) by littsk
- [feature] ColossalEval: Evaluation Pipeline for LLMs (#4786) by Yuanchen
Checkpointio
- [checkpointio] hotfix torch 2.0 compatibility (#4824) by Hongxin Liu
- [checkpointio] support unsharded checkpointIO for hybrid parallel (#4774) by Baizhou Zhang
Infer
- [infer] fix test bug (#4838) by Xu Kai
- [Infer] Serving example w/ ray-serve (multiple GPU case) (#4841) by Yuanheng Zhao
- [Infer] Colossal-Inference serving example w/ TorchServe (single GPU case) (#4771) by Yuanheng Zhao
Chat
- [chat] fix gemini strategy (#4698) by flybird11111
Lazy
- [lazy] support from_pretrained (#4801) by Hongxin Liu
Fix
- [fix] fix weekly running example (#4787) by flybird11111
Full Changelog: v0.3.3...v0.3.4