Foundry Local Release Notes: v0.8.94 🚀
✨ New Features
Improved performance for multi-turn conversations, especially time to first token, with the addition of the continuous decoding feature. Only the new tokens are sent to the model instead of the entire conversation; the previous inputs and responses are kept by the model in the KV-cache.
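The sketch below is a minimal illustration of a multi-turn chat against Foundry Local's OpenAI-compatible endpoint; the base URL, port, and model alias are illustrative assumptions, not values confirmed by this release. With continuous decoding, the service reuses its KV-cache across turns, so only the newly appended tokens are processed by the model.

```python
# Minimal sketch: a multi-turn chat against Foundry Local's OpenAI-compatible
# endpoint. The base_url, port, and model alias below are illustrative
# assumptions; check your local Foundry Local service for the actual values.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5273/v1",  # assumed local endpoint for illustration
    api_key="not-needed-for-local",       # the local service does not require a real key
)

MODEL = "phi-3.5-mini"  # hypothetical model alias for illustration

history = [{"role": "system", "content": "You are a helpful assistant."}]

def ask(prompt: str) -> str:
    """Append the user turn, send the conversation, and record the reply.

    The client still sends the full message list, but with continuous
    decoding the service recognizes the shared prefix in its KV-cache and
    only runs the model over the new tokens, which shortens time to first
    token on later turns.
    """
    history.append({"role": "user", "content": prompt})
    response = client.chat.completions.create(model=MODEL, messages=history)
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

print(ask("Summarize what continuous decoding does."))
print(ask("How does that help a follow-up question like this one?"))
```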
Added a website showing the full model list with hardware variants: https://foundrylocal.ai/models
🐛 Bug fixes
- Foundry Local now defaults to `--default-log-level` instead of `Information` if `--log-level` is not provided. Foundry Local also elevates the level at which some errors are written from `Information` to `Error`. - #265
- #263
- #71
📝 Known issues
- This version is not supported on macOS. Please use the previous release for macOS. Support is coming soon!
- If a model is not found in the catalog, an exception is thrown instead of showing a warning / suggestion message and exiting gracefully.
- When the context length (set by the `max_length` value) is exhausted, an exception is thrown instead of showing a warning / error message. A caller-side workaround is sketched below.
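As a stopgap until the behavior above is fixed, callers can wrap requests in a try/except. This is a minimal sketch that reuses the `client`, `MODEL`, and `history` objects from the earlier example; the broad exception handler is deliberate because the exact exception type thrown by the service is not specified in these notes.

```python
# Minimal sketch of guarding against the exceptions described above.
# Assumes the `client`, `MODEL`, and `history` objects from the previous
# example; the broad `except Exception` is intentional because the exact
# exception type raised by the service is not documented here.
def ask_safely(prompt: str) -> str | None:
    history.append({"role": "user", "content": prompt})
    try:
        response = client.chat.completions.create(model=MODEL, messages=history)
    except Exception as exc:  # e.g. unknown model alias or exhausted context length
        print(f"Request failed: {exc}")
        history.pop()  # drop the turn that was never answered
        return None
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply
```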