## Highlights
- UQFF
- FLUX model
- Llama 3.2 Vision model
## MSRV
The MSRV of this release is 1.79.0.
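To check that an installed toolchain meets this MSRV, you can compare version strings with `sort -V`. A minimal sketch (the `installed` value is hardcoded for illustration; in practice it would come from `rustc --version | awk '{print $2}'`):

```shell
# Hypothetical installed version; replace with: rustc --version | awk '{print $2}'
installed=1.80.1
msrv=1.79.0
# sort -V orders version strings numerically; if the MSRV sorts first,
# the installed toolchain is at least as new as the MSRV.
if [ "$(printf '%s\n%s\n' "$msrv" "$installed" | sort -V | head -n1)" = "$msrv" ]; then
  echo "toolchain meets MSRV"
else
  echo "please run: rustup update"
fi
```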
## What's Changed
- Enable automatic determination of normal loader type by @EricLBuehler in #742
- Add the `ForwardInputsResult` API by @EricLBuehler in #745
- Implement Mixture of Quantized Experts (MoQE) by @EricLBuehler in #747
- Bump quinn-proto from 0.11.6 to 0.11.8 by @dependabot in #748
- Fix f64-f32 type mismatch for Metal/Accelerate by @EricLBuehler in #752
- Nicer error when misconfigured PagedAttention input metadata by @EricLBuehler in #753
- Update deps, support CUDA 12.6 by @EricLBuehler in #755
- Patch bug when not using PagedAttention by @EricLBuehler in #759
- Fix `MistralRs` Drop impl in tokio runtime by @EricLBuehler in #762
- Use nicer Candle Error APIs by @EricLBuehler in #767
- Support setting seed by @EricLBuehler in #766
- Fix Metal build error with seed by @EricLBuehler in #771
- Fix and add checks for no kv cache by @EricLBuehler in #776
- UQFF: The uniquely powerful quantized file format. by @EricLBuehler in #770
- Add `Scheduler::running_len` by @EricLBuehler in #780
- Deduplicate RoPE caches by @EricLBuehler in #787
- Easier and simpler Rust-side API by @EricLBuehler in #785
- Add some examples for AnyMoE by @EricLBuehler in #788
- Rust API for sampling by @EricLBuehler in #790
- Our first Diffusion model: FLUX by @EricLBuehler in #758
- Fix build bugs with Metal, `NSUInteger` by @EricLBuehler in #792
- Support weight tying in Llama 3.2 GGUF models by @EricLBuehler in #801
- Implement the Llama 3.2 vision models by @EricLBuehler in #796
**Full Changelog**: v0.3.0...v0.3.1