Details
server : add thinking content blocks to Anthropic Messages API (#18551)
- server : add thinking content blocks to Anthropic Messages API
Add support for returning reasoning/thinking content in Anthropic API
responses when using models with --reasoning-format deepseek and the
thinking parameter enabled.
- Non-streaming: adds thinking block before text in content array
- Streaming: emits thinking_delta events with correct block indices
- Partial streaming: tracks reasoning state across chunks via
anthropic_has_reasoning member variable
Tested with bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF model.
- server : fix Anthropic API streaming for thinking content blocks
Add signature field and fix duplicate content_block_start events in
Anthropic Messages API streaming responses for reasoning models.
- server: refactor Anthropic streaming state to avoid raw pointer
Replace raw pointer to task_result_state with direct field copies:
- Copy state fields in update() before processing chunk
- Use local copies in to_json_anthropic() instead of dereferencing
- Pre-compute state updates for next chunk in update()
This makes the data flow clearer and avoids unsafe pointer patterns.
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler: