EricLBuehler/mistral.rs v0.2.5 on GitHub

What's Changed

Refactor ISQ quant parsing by @EricLBuehler in #664
Refactor server examples to use OpenAI Python client by @EricLBuehler in #665
Implement prompt chunking by @EricLBuehler in #623
Python example and server example cleanup by @EricLBuehler in #668
Implement GPTQ quantization by @EricLBuehler in #467
Update deps by @EricLBuehler in #672
Rework the automatic dtype selection feature by @EricLBuehler in #676
Fix backend Candle fork Metal, flash attn, also Llama linear by @EricLBuehler in #681
Use converted tokenizer.json in tests by @EricLBuehler in #682
Refactor ISQ and mistralrs-quant by @EricLBuehler in #683
Fix metal build for isq by @EricLBuehler in #686
Add missing error case in automatic dtype selection feature by @ac3xx in #685
Fix null in tool type response by @wseaton in #687
Implement HQQ quantization by @EricLBuehler in #677
Bump version to 0.2.5 by @EricLBuehler in #688

Full Changelog: v0.2.4...v0.2.5

Install prebuilt binaries via shell script

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/EricLBuehler/mistral.rs/releases/download/v0.2.5/mistralrs-server-installer.sh | sh

File	Platform	Checksum
mistralrs-server-aarch64-apple-darwin.tar.xz	Apple Silicon macOS	checksum
mistralrs-server-x86_64-apple-darwin.tar.xz	Intel macOS	checksum
mistralrs-server-x86_64-unknown-linux-gnu.tar.xz	x64 Linux	checksum
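
Each archive in the table is published with a checksum. The following is a minimal sketch of checking a manually downloaded archive against one, not part of the release itself. It assumes the checksum is a SHA-256 hex digest and that the checksum file's first whitespace-separated token is that digest; neither detail is stated on this page. The helper names `sha256_of_file` and `matches_checksum` are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(archive_path, checksum_text):
    """Compare an archive with the contents of its checksum file.

    Assumption: the first whitespace-separated token of `checksum_text`
    is the expected hex digest (any trailing file name is ignored).
    """
    tokens = checksum_text.split()
    if not tokens:
        return False  # empty checksum file: treat as a mismatch
    expected = tokens[0].lower()
    return sha256_of_file(archive_path) == expected
```

To use it, read the downloaded checksum file as text and pass it together with the archive path, for example `matches_checksum("mistralrs-server-x86_64-unknown-linux-gnu.tar.xz", checksum_text)`. A `False` result means the archive should not be unpacked.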