github ggml-org/llama.cpp b8330

5 hours ago

server: reset counter related to kill-switch on client error (#20513)

  • server: reset kill-switch on client error

This avoids triggering a server kill switch.

If the client sends a request that exceeds the configured context size, an appropriate HTTP 400 response is provided and no tokens are generated.

However, since no tokens are generated, update_slots() increments n_empty_consecutive. If the client sends 3 such requests in a row, the server terminates.

  • moved counter reset as per recommendation

  • cont : minor


Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
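
The counter behaviour described in the commit message can be illustrated with a small standalone sketch. This is not the llama.cpp implementation: only the counter name n_empty_consecutive and the limit of 3 come from the notes above, while pass_result, empty_pass_guard, on_pass and the reset_on_client_error flag are hypothetical names introduced here. It assumes the fix amounts to treating a rejected request like a productive pass for the purpose of the counter.

```cpp
#include <cassert>

// Hypothetical outcome of one scheduling pass (not a real llama.cpp type).
enum class pass_result {
    TOKENS_GENERATED, // at least one token was produced
    CLIENT_ERROR,     // request rejected, e.g. prompt exceeds the context size (HTTP 400)
    EMPTY,            // nothing produced and no client error to explain it
};

// Hypothetical guard modelling the kill-switch counter.
struct empty_pass_guard {
    int  limit;                  // consecutive empty passes tolerated before terminating
    bool reset_on_client_error;  // true models the behaviour after the fix (assumption)
    int  n_empty_consecutive = 0;

    empty_pass_guard(int limit, bool reset_on_client_error)
        : limit(limit), reset_on_client_error(reset_on_client_error) {
        assert(limit > 0);
    }

    // Returns true when the kill switch would fire.
    bool on_pass(pass_result r) {
        const bool explained =
            r == pass_result::TOKENS_GENERATED ||
            (r == pass_result::CLIENT_ERROR && reset_on_client_error);
        if (explained) {
            n_empty_consecutive = 0; // a bad request is the client's fault, not a stuck server
            return false;
        }
        n_empty_consecutive++;
        return n_empty_consecutive >= limit;
    }
};
```

With limit set to 3 and reset_on_client_error set to false, the third consecutive CLIENT_ERROR makes on_pass return true, which mirrors the reported termination. With the flag set to true, the counter stays at zero however many oversized requests arrive, while genuinely EMPTY passes still count toward the limit.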

