This release includes a unique feature that shows model loading progress in the Reasoning content. When enabled in the config, llama-swap streams small progress updates so there is no silence while waiting for the model to swap and load.
- Add a new global config setting: `sendLoadingState: true`
- Add a new model override setting: `model.sendLoadingState: true` to control it on a per-model basis
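A minimal sketch of how these settings might look in a llama-swap config file (the model name, command, and paths below are illustrative, not taken from this release):

```yaml
# Global default: stream loading progress while a model swaps in (sketch)
sendLoadingState: true

models:
  "example-model":
    cmd: llama-server --port ${PORT} -m /path/to/model.gguf
    # Per-model override: opt this model out of loading-state streaming
    sendLoadingState: false
```

The per-model value, when present, takes precedence over the global setting for that model.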
Demo:
llama-swap-issue-366.mp4
Thanks to @ServeurpersoCom for the very cool idea!