- Adapt DPO training from the TRL library
- Support fine-tuning the Qwen-7B, Qwen-7B-Chat, XVERSE-13B, and ChatGLM2-6B models
- Implement the "safe" ChatML template for Qwen-7B-Chat
- Improve the Web UI
- Prettier README by @codemayq #382
- New features: #395 #451
- Fix InternLM-7B inference #312
- Fix bugs: #351 #354 #361 #376 #408 #417 #420 #423 #426