What's new in 0.2.2 (2023-08-25)
These are the changes in inference v0.2.2.
New features
- FEAT: Support Llama-2 PyTorch model by @jiayini1119 in #387
- FEAT: Support code-llama by @UranusSeven in #402
Enhancements
- ENH: Update max_tokens to 32k by @Bojun-Feng in #386
Bug fixes
- BUG: last token is duplicated by @UranusSeven in #398
Others
- Fix chatglm params by @Bojun-Feng in #400
Full Changelog: v0.2.1...v0.2.2