Foundry Local Release Notes: v0.8.101 🚀
✨ New Features
Improved performance for multi-turn conversations on macOS, especially time to first token, with the addition of the continuous decoding feature. Only new tokens are sent to the model instead of the entire conversation; previous inputs and responses are retained by the model in its KV cache.
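The benefit is easiest to see as token accounting. The sketch below (not the Foundry Local implementation, and using hypothetical per-turn token counts) contrasts re-encoding the whole conversation each turn with continuous decoding, where earlier turns already sit in the KV cache and only new tokens are processed:

```python
def tokens_processed(turn_lengths, continuous_decoding):
    """Return total prompt tokens the model must encode across all turns.

    turn_lengths: tokens added per turn (user message + model reply).
    """
    total = 0
    history = 0
    for new_tokens in turn_lengths:
        if continuous_decoding:
            total += new_tokens            # only the new tokens are sent
        else:
            total += history + new_tokens  # whole conversation re-encoded
        history += new_tokens              # conversation grows either way
    return total

turns = [120, 80, 100, 60]  # hypothetical per-turn token counts
baseline = tokens_processed(turns, continuous_decoding=False)
cached = tokens_processed(turns, continuous_decoding=True)
print(baseline, cached)  # prints "980 360"
```

With caching the prompt-processing cost stays proportional to the new tokens per turn rather than to the full conversation length, which is why time to first token improves most on long multi-turn chats.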
📝 Known issues
When the context length (set by the `max_length` value) is exhausted, an exception is thrown instead of a warning or error message being shown.
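Until this is fixed, one generic client-side mitigation is to trim the oldest turns before each request so the conversation stays under the context budget. The sketch below is not a Foundry Local API; `count_tokens` is a crude word-count stand-in for a real tokenizer, and `max_length` here is assumed to be a token budget:

```python
def count_tokens(message):
    # Crude stand-in: a real client would use the model's tokenizer.
    return len(message["content"].split())

def trim_history(messages, max_length):
    """Keep the most recent messages whose combined size fits max_length."""
    kept, used = [], 0
    for message in reversed(messages):  # walk newest to oldest
        cost = count_tokens(message)
        if used + cost > max_length:
            break                       # oldest turns are dropped first
        kept.append(message)
        used += cost
    return list(reversed(kept))         # restore chronological order

history = [
    {"role": "user", "content": "first question about the model"},
    {"role": "assistant", "content": "a long detailed answer " * 10},
    {"role": "user", "content": "short follow up"},
]
trimmed = trim_history(history, max_length=20)
```

Note that dropping old turns discards context the model may still need, so the budget should be set with some headroom rather than at the exact limit.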