ggml-org/llama.cpp release b7690

one month ago

server: fix n_cmpl not skipping prompt processing (#18663)

  • server: fix n_cmpl not skipping prompt processing

  • fix infinite loop on empty batch

  • cont : init child samplers + modify child logic

  • cont : cleanup

  • cont : improve n_cmpl logic

  • launch the parent task first so it finds the slot with best cache
  • parent task waits for child tasks to be launched
  • when a child task finishes - remove its cache
  • cont : remove redundant function

  • cont : reduce parent checks

  • fix : nullptr task dereference


Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
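The parent/child ordering described above (launch the parent first so it claims the slot with the best cache, have it wait until every child task is launched, and drop a child's cache when it finishes) can be sketched as below. This is a minimal illustration of the synchronization pattern only; `Scheduler`, `run_parent`, `launch_child`, and `finish_child` are hypothetical names, not actual llama.cpp server symbols.

```cpp
#include <condition_variable>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Illustrative stand-in for the server's per-request task bookkeeping.
// None of these names are real llama.cpp symbols.
struct Scheduler {
    std::mutex mtx;
    std::condition_variable cv;
    int children_launched = 0;
    int children_expected = 0;
    std::map<int, std::vector<int>> child_cache;  // per-child cache stand-in

    // The parent task runs first (so it can pick the slot with the best
    // cache), then blocks until every child task has been launched.
    void run_parent(int n_children) {
        std::unique_lock<std::mutex> lk(mtx);
        children_expected = n_children;
        cv.wait(lk, [&] { return children_launched == children_expected; });
    }

    void launch_child(int id) {
        std::lock_guard<std::mutex> lk(mtx);
        child_cache[id] = {id};  // child gets its own cache entry
        ++children_launched;
        cv.notify_all();
    }

    // When a child task finishes, its cache is removed.
    void finish_child(int id) {
        std::lock_guard<std::mutex> lk(mtx);
        child_cache.erase(id);
    }
};

// Drives the sequence end to end; returns how many child caches remain.
int run_demo() {
    Scheduler sched;
    const int n_children = 3;
    std::thread parent([&] { sched.run_parent(n_children); });
    for (int i = 0; i < n_children; ++i) sched.launch_child(i);
    parent.join();  // unblocks only after all children are launched
    for (int i = 0; i < n_children; ++i) sched.finish_child(i);
    return (int) sched.child_cache.size();  // every finished child's cache removed
}
```

The condition-variable wait is what prevents the parent from racing ahead of child launches, and erasing the cache entry on finish mirrors the "when a child task finishes - remove its cache" step.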

Prebuilt binaries are available for macOS/iOS, Linux, Windows, and openEuler.
