github hiyouga/LlamaFactory v0.1.8
v0.1.8: FlashAttention-2 and Baichuan2


New features

  • Support FlashAttention-2 for LLaMA models (requires an RTX 4090, A100, A800, or H100 GPU)
  • Support training the Baichuan2 models
  • Use right-padding to avoid overflow in fp16 training
  • Align the reward score computation with DeepSpeed-Chat for better generation quality
  • Support the `--lora_target all` argument, which automatically finds the modules applicable for LoRA training
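To illustrate what `--lora_target all` does conceptually, the sketch below shows one way such an option could resolve to concrete module names: scan the model's module listing and collect the leaf names of every linear layer. This is a hypothetical, dependency-free illustration, not LLaMA Factory's actual implementation; the `toy_modules` list is a hand-written stand-in for what PyTorch's `model.named_modules()` would yield for a LLaMA-style decoder layer.

```python
# Illustrative sketch only: how a "--lora_target all" option might
# resolve to the set of module names that LoRA should wrap.

def find_linear_module_names(named_modules):
    """Collect the unique leaf names of all linear layers
    (e.g. 'q_proj', 'v_proj'), which LoRA would then adapt."""
    targets = set()
    for name, cls_name in named_modules:
        if cls_name == "Linear":
            # Keep only the leaf name, since LoRA configs usually
            # target modules by suffix rather than full path.
            targets.add(name.rsplit(".", 1)[-1])
    return sorted(targets)

# Hand-written stand-in for model.named_modules() on a
# LLaMA-style decoder layer (hypothetical names for illustration).
toy_modules = [
    ("model.layers.0.self_attn.q_proj", "Linear"),
    ("model.layers.0.self_attn.k_proj", "Linear"),
    ("model.layers.0.self_attn.v_proj", "Linear"),
    ("model.layers.0.self_attn.o_proj", "Linear"),
    ("model.layers.0.mlp.gate_proj", "Linear"),
    ("model.layers.0.input_layernorm", "RMSNorm"),
]

print(find_linear_module_names(toy_modules))
# → ['gate_proj', 'k_proj', 'o_proj', 'q_proj', 'v_proj']
```

The non-linear `RMSNorm` module is skipped, so only the attention and MLP projections end up as LoRA targets.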

Bug fixes
