What's Changed
- Add and update template READMEs by @EricLBuehler in #405
- Improve Rust crates docs by @EricLBuehler in #406
- Expose phi3v loader and remove unused deps by @EricLBuehler in #408
- Support GGUF Mixtral format where experts are in one tensor by @EricLBuehler in #355
- Refactor with normal loading metadata for vision models by @EricLBuehler in #409
- Phi-3 vision ISQ support by @EricLBuehler in #410
- Remove causal masks cache by @EricLBuehler in #412
- Fix: use new slice_assign by @EricLBuehler in #415
- Fix Phi-3 GGUF by @EricLBuehler in #414
- Implement gpt2 (BPE) GGUF tokenizer conversion by @EricLBuehler in #397
- Support chat template from GGUF by @EricLBuehler in #416
- Expose API to specify dtype during loading by @EricLBuehler in #417
- Lock candle version to commit by @EricLBuehler in #419
- Bump version to 0.1.17 by @EricLBuehler in #420
Full Changelog: v0.1.16...v0.1.17