What's Changed
- Implement min-p sampling by @EricLBuehler in #625
- Tweak handling when PA cannot allocate by @EricLBuehler in #632
- Update deps by @EricLBuehler in #633
- Improve penalty context window calculation by @EricLBuehler in #636
- Allow setting PagedAttention KV cache allocation from context size by @EricLBuehler in #640
- Bump version to 0.2.3 by @EricLBuehler in #638
Full Changelog: v0.2.2...v0.2.3
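For readers unfamiliar with the min-p sampling added in #625, here is a minimal sketch of the general technique. It is not the mistral.rs implementation. Min-p keeps only tokens whose probability is at least `min_p` times the probability of the most likely token, then renormalizes the survivors. The function name `min_p_filter` and its signature are hypothetical and chosen for illustration only.

```rust
/// Hypothetical helper (not the mistral.rs API): apply min-p filtering to a
/// probability distribution and renormalize the surviving entries.
fn min_p_filter(probs: &[f32], min_p: f32) -> Vec<f32> {
    // Largest probability in the distribution; 0.0 for an empty slice.
    let max_prob = probs.iter().copied().fold(0.0_f32, f32::max);

    // Tokens whose probability falls below this scaled threshold are dropped.
    let threshold = min_p * max_prob;

    // Zero out the dropped tokens, keeping positions aligned with token ids.
    let mut kept: Vec<f32> = probs
        .iter()
        .map(|&p| if p >= threshold { p } else { 0.0 })
        .collect();

    // Renormalize so the remaining probabilities sum to 1.
    let sum: f32 = kept.iter().sum();
    if sum > 0.0 {
        for p in kept.iter_mut() {
            *p /= sum;
        }
    }
    kept
}
```

For example, calling `min_p_filter(&[0.5, 0.3, 0.15, 0.05], 0.2)` gives a threshold of 0.1, so the last token is zeroed out and the other three are rescaled to sum to 1. Because the cutoff scales with the top probability, the filter is strict when the model is confident and permissive when the distribution is flat.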
Install mistralrs-server 0.2.3
Install prebuilt binaries via shell script
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/EricLBuehler/mistral.rs/releases/download/v0.2.3/mistralrs-server-installer.sh | sh
Download mistralrs-server 0.2.3
File | Platform | Checksum |
---|---|---|
mistralrs-server-aarch64-apple-darwin.tar.xz | Apple Silicon macOS | checksum |
mistralrs-server-x86_64-apple-darwin.tar.xz | Intel macOS | checksum |
mistralrs-server-x86_64-unknown-linux-gnu.tar.xz | x64 Linux | checksum |