New features
- Support full-parameter RLHF training (reward modeling and PPO)
- Refactor llmtuner core in #1525 by @hiyouga
- Better LLaMA Board: full-parameter RLHF and demo mode
New models
- Base models
  - ChineseLLaMA-1.3B
  - LingoWhale-8B
- Instruct/Chat models
  - ChineseAlpaca-1.3B
  - Zephyr-7B-Alpha/Beta