A minor release fixing bugs in ALiBi, Falcon-40B, and Code Llama.
What's Changed
- Fix "tansformers_module" ModuleNotFoundError when loading a model with trust_remote_code=True by @Jingru in #871
- Fix wrong dtype in PagedAttentionWithALiBi bias by @Yard1 in #996
- Fix CUDA error when running inference with the Falcon-40B base model by @kyujin-cho in #992
- [Docs] Update installation page by @WoosukKwon in #1005
- Update setup.py by @WoosukKwon in #1006
- Use FP32 in RoPE initialization by @WoosukKwon in #1004
- Bump up the version to v0.1.7 by @WoosukKwon in #1013
New Contributors
- @Jingru made their first contribution in #871
- @kyujin-cho made their first contribution in #992
Full Changelog: v0.1.6...v0.1.7