github ggml-org/llama.cpp b7508

server: prevent data race from HTTP threads (#18263)

  • server: prevent data race from HTTP threads

  • fix params

  • fix default_generation_settings

  • nits: make handle_completions_impl look less strange

  • stricter const

  • fix GGML_ASSERT(idx < states.size())

  • move index to be managed by server_response_reader

  • http: make sure req & res lifecycle are tied together

  • fix compile

  • fix buggy index handling

  • fix data race for lora endpoint

  • nits: fix shadow variable

  • nits: revert redundant changes

  • nits: correct naming for json_webui_settings

Binary downloads: macOS/iOS, Linux, Windows, openEuler.
