New features
- Support FlashAttention-2 for LLaMA models (an RTX 4090, A100, A800, or H100 GPU is required; see the sketch after this list)
- Support training the Baichuan2 models
- Use right-padding to avoid overflow in fp16 training (also mentioned here)
- Align the computation of the reward score with DeepSpeed-Chat for better generation quality (see the sketch after this list)
- Support the `--lora_target all` argument, which automatically finds the applicable modules for LoRA training (a sketch of this idea follows the list)
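The FlashAttention-2 item above relies on the `flash-attn` kernels; the minimal sketch below only illustrates the underlying `flash_attn_func` call and its half-precision, CUDA-only requirements. It is not the integration code in this repository.

```python
import torch
from flash_attn import flash_attn_func  # requires flash-attn >= 2.0 and an Ampere/Ada/Hopper GPU

# FlashAttention-2 expects fp16/bf16 tensors shaped (batch, seq_len, num_heads, head_dim) on CUDA.
batch, seq_len, num_heads, head_dim = 2, 128, 8, 64
q = torch.randn(batch, seq_len, num_heads, head_dim, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

out = flash_attn_func(q, k, v, causal=True)  # same shape as q
```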
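For the reward-score alignment with DeepSpeed-Chat, the sketch below assumes the DeepSpeed-Chat convention of reading the value head's output at the last non-padding token of each right-padded sequence; the function name and shapes are illustrative, not this repository's actual code.

```python
import torch

def get_reward_score(values: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Take the value-head output at the last non-padding token of each sequence.

    values:         (batch, seq_len) per-token scores from the reward model
    attention_mask: (batch, seq_len) with 1 for real tokens and 0 for padding
    """
    last_token_indices = attention_mask.sum(dim=-1) - 1                    # (batch,)
    return values.gather(1, last_token_indices.unsqueeze(-1)).squeeze(-1)  # (batch,)
```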
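`--lora_target all` can be thought of as collecting every linear projection in the model (except the output head) as a LoRA target. The helper below is a hypothetical sketch of that idea, not the argument's actual implementation.

```python
import torch

def find_all_lora_targets(model: torch.nn.Module) -> list[str]:
    """Return the short names of all Linear modules, skipping the LM head."""
    targets = set()
    for name, module in model.named_modules():
        if isinstance(module, torch.nn.Linear) and "lm_head" not in name:
            # PEFT matches targets by the trailing component of the module name,
            # e.g. "q_proj" rather than "model.layers.0.self_attn.q_proj".
            targets.add(name.split(".")[-1])
    return sorted(targets)
```

The resulting names could then be passed to `peft.LoraConfig(target_modules=...)` instead of being listed by hand.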
Bug fixes
- Use efficient EOS tokens to align with the Baichuan training (baichuan-inc/Baichuan2#23)
- Remove PeftTrainer so that model checkpoints are saved properly in DeepSpeed training
- Fix bugs in the web UI by @beat4ocean in #596, by @codemayq in #644, #651, #678 and #741, and by @kinghuin in #786
- Add dataset explanation by @panpan0000 in #629
- Fix a bug in the DPO data collator
- Fix a bug in the ChatGLM2 tokenizer with right-padding
- Fix #608, #617, #649, #757, #761, #763, #809 and #818