This release adds support for configuring a custom endpoint to check when the upstream server is ready. The llama.cpp server's /health endpoint is no longer hardcoded as a dependency, so llama-swap should now work with anything that provides an OpenAI-compatible API.
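For example, a per-model readiness check might look like the sketch below. This is a minimal illustration only: the `checkEndpoint` key name and the `/v1/models` path are assumptions for demonstration, not taken from the llama-swap documentation.

```yaml
# Hypothetical config.yaml sketch (key names are assumptions, not confirmed docs).
models:
  "my-model":
    cmd: /path/to/openai-compatible-server --port 9001
    proxy: http://127.0.0.1:9001
    # Path polled on the upstream server until it responds, replacing the
    # previously hardcoded llama.cpp /health endpoint.
    checkEndpoint: /v1/models
```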