This release includes a unique feature that shows model loading progress in the Reasoning content. When enabled in the config, llama-swap streams small progress updates so there is no silence while waiting for the model to swap and load.
- Add a new global config setting: `sendLoadingState: true`
- Add a new model override setting: `model.sendLoadingState: true` to control it on a per-model basis
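A minimal sketch of how these settings might look in a llama-swap config file (the model name, command, and paths below are illustrative, not taken from this release):

```yaml
# Global default: stream loading progress while a model swaps in (sketch)
sendLoadingState: true

models:
  "example-model":
    cmd: llama-server --port ${PORT} -m /path/to/model.gguf
    # Per-model override: opt this model out of loading-state streaming
    sendLoadingState: false
```

The per-model value, when present, takes precedence over the global setting for that model.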
Demo:
llama-swap-issue-366.mp4
Thanks to @ServeurpersoCom for the very cool idea!