github ggml-org/llama.cpp b7582


sampling: reuse token data buffer in llama_sampler_sample (#18365)

  • sampling: reuse token data buffer in llama_sampler_sample

  • move cur buffer before timing section, after samplers

  • minor : fix build


Co-authored-by: Georgi Gerganov ggerganov@gmail.com

Builds are provided for macOS/iOS, Linux, Windows, and openEuler.
