🔧 Bug fixes
This release fixes several issues with the new llama.cpp loader, especially on Windows. Thanks everyone for the feedback.
- Fix the poor performance of the new llama.cpp loader on Windows. It was caused by using `localhost` instead of `127.0.0.1` for requests. It's a lot faster now.
- Fix the new llama.cpp loader failing to unload models.
- Fix API requests that don't use streaming or that omit `sampler_priority` when using the new llama.cpp loader.
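For context on the first fix, the difference between the two hostnames can be observed with Python's standard `socket` module (the port below is arbitrary): `localhost` goes through name resolution and may resolve to the IPv6 loopback `::1` first, which on Windows can add per-request latency when the server only listens on IPv4, while `127.0.0.1` is a literal IPv4 address that always targets the IPv4 loopback directly.

```python
import socket

# "localhost" is resolved by the system resolver and may yield an
# IPv6 result (::1) before the IPv4 one (127.0.0.1), depending on
# the OS configuration.
for family, _, _, _, addr in socket.getaddrinfo(
    "localhost", 8080, proto=socket.IPPROTO_TCP
):
    print(family, addr)

# "127.0.0.1" is a literal IPv4 address: no resolver ambiguity,
# every result is AF_INET pointing at the IPv4 loopback.
for family, _, _, _, addr in socket.getaddrinfo(
    "127.0.0.1", 8080, proto=socket.IPPROTO_TCP
):
    print(family, addr)
```

This is only a sketch of why the change helps; the actual fix is simply using `127.0.0.1` in the loader's request URLs.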