ggml-org/llama.cpp release b7582

one day ago

sampling: reuse token data buffer in llama_sampler_sample (#18365)

  • sampling: reuse token data buffer in llama_sampler_sample

  • move cur buffer before timing section, after samplers

  • minor : fix build


Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

macOS/iOS:

Linux:

Windows:

openEuler:
