v0.1.8: FlashAttention-2 and Baichuan2

New features

  • Support FlashAttention-2 for LLaMA models (an RTX 4090, A100, A800, or H100 GPU is required); a calling sketch follows this list.
  • Support training the Baichuan2 models
  • Use right-padding to avoid overflow in fp16 training (also mentioned here); see the padding sketch below.
  • Align the reward score computation with DeepSpeed-Chat's method for better generation quality; see the reward sketch below.
  • Support the `--lora_target all` argument, which automatically finds the modules applicable for LoRA training; see the discovery sketch below.

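For reference, FlashAttention-2 exposes a fused attention kernel through the `flash-attn` package. Below is a minimal sketch of calling that kernel directly; it illustrates the requirements (half precision, recent GPU) rather than LLaMA-Factory's internal patching, whose details are not shown in these notes.

```python
import torch
from flash_attn import flash_attn_func  # pip install flash-attn

# q, k, v: (batch, seq_len, num_heads, head_dim). FlashAttention-2 requires
# fp16/bf16 tensors on an Ampere-or-newer GPU (e.g. RTX 4090, A100, H100).
q = torch.randn(2, 512, 32, 128, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# causal=True applies the autoregressive mask used by decoder-only LLaMA models.
out = flash_attn_func(q, k, v, causal=True)  # shape (2, 512, 32, 128)
```
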
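On right-padding: a minimal sketch with a Hugging Face tokenizer (the checkpoint name is only illustrative). The point is that padding is appended after the real tokens, which sidesteps the fp16 overflow that left-padded batches can trigger during training.

```python
from transformers import AutoTokenizer

# Any LLaMA-style tokenizer works here; the model name is an example.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship without a pad token
tokenizer.padding_side = "right"           # pad after the real tokens

batch = tokenizer(["short prompt", "a somewhat longer prompt"],
                  padding=True, return_tensors="pt")
# attention_mask marks the pad positions so they are ignored during training.
```
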
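On the reward score: DeepSpeed-Chat reads the reward as the value-head output at the last real (non-padding) token of each sequence. A minimal sketch of that computation, assuming right-padded batches and a `values` tensor of per-token scores (both names are illustrative, not LLaMA-Factory's exact code):

```python
import torch

def sequence_rewards(values: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """values, attention_mask: (batch, seq_len). Returns one scalar reward per sequence."""
    # With right padding, the last real token sits at index (number of real tokens - 1).
    last_token = attention_mask.sum(dim=-1) - 1
    # Take the value-head score at that position, as DeepSpeed-Chat does.
    return values.gather(1, last_token.unsqueeze(-1)).squeeze(-1)
```
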
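On `--lora_target all`: the idea is to enumerate every linear projection in the model and hand the resulting names to LoRA. A minimal sketch of that discovery step (the helper name and the `lm_head` exclusion are assumptions, not LLaMA-Factory's exact implementation):

```python
import torch

def find_all_linear_modules(model: torch.nn.Module) -> list[str]:
    # Collect the leaf names of every nn.Linear layer (e.g. q_proj, v_proj, ...),
    # skipping the output head, which LoRA adapters usually leave untouched.
    names = set()
    for name, module in model.named_modules():
        if isinstance(module, torch.nn.Linear) and "lm_head" not in name:
            names.add(name.split(".")[-1])
    return sorted(names)
```

The collected names can then be passed as LoRA target modules, e.g. via peft's `LoraConfig(target_modules=...)`.
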
Bug fix
