Foundry Local Release Notes: v0.8.94 🚀
✨ New Features
Improved performance for multi-turn conversations, especially time to first token, with the addition of the continuous decoding feature. Only the new tokens are sent to the model instead of the entire conversation; the previous inputs and responses are kept by the model in the KV-cache.
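The sketch below is a minimal illustration of a multi-turn chat against Foundry Local's OpenAI-compatible endpoint; the base URL, port, and model alias are illustrative assumptions, not values confirmed by this release. With continuous decoding, the service reuses its KV-cache across turns, so only the newly appended tokens are processed by the model.

```python
# Minimal sketch: a multi-turn chat against Foundry Local's OpenAI-compatible
# endpoint. The base_url, port, and model alias below are illustrative
# assumptions; check your local Foundry Local service for the actual values.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5273/v1",  # assumed local endpoint for illustration
    api_key="not-needed-for-local",       # the local service does not require a real key
)

MODEL = "phi-3.5-mini"  # hypothetical model alias for illustration

history = [{"role": "system", "content": "You are a helpful assistant."}]

def ask(prompt: str) -> str:
    """Append the user turn, send the conversation, and record the reply.

    The client still sends the full message list, but with continuous
    decoding the service recognizes the shared prefix in its KV-cache and
    only runs the model over the new tokens, which shortens time to first
    token on later turns.
    """
    history.append({"role": "user", "content": prompt})
    response = client.chat.completions.create(model=MODEL, messages=history)
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

print(ask("Summarize what continuous decoding does."))
print(ask("How does that help a follow-up question like this one?"))
```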
Added a website showing the full model list with hardware variants: https://foundrylocal.ai/models
🐛 Bug fixes
- Foundry Local now defaults to `--default-log-level` instead of `Information` if `--log-level` is not provided. Foundry Local also elevates the level at which some errors are written from `Information` to `Error`. - #265
- #263
- #71
📝 Known issues
- This version is not supported on macOS. Please use the previous release for macOS. Support is coming soon!
- If a model is not found in the catalog, an exception is thrown instead of showing a warning / suggestion message and exiting gracefully.
- When the context length (set by the `max_length` value) is exhausted, an exception is thrown instead of showing a warning / error message. A caller-side workaround is sketched below.
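As a stopgap until the behavior above is fixed, callers can wrap requests in a try/except. This is a minimal sketch that reuses the `client`, `MODEL`, and `history` objects from the earlier example; the broad exception handler is deliberate because the exact exception type thrown by the service is not specified in these notes.

```python
# Minimal sketch of guarding against the exceptions described above.
# Assumes the `client`, `MODEL`, and `history` objects from the previous
# example; the broad `except Exception` is intentional because the exact
# exception type raised by the service is not documented here.
def ask_safely(prompt: str) -> str | None:
    history.append({"role": "user", "content": prompt})
    try:
        response = client.chat.completions.create(model=MODEL, messages=history)
    except Exception as exc:  # e.g. unknown model alias or exhausted context length
        print(f"Request failed: {exc}")
        history.pop()  # drop the turn that was never answered
        return None
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply
```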