GitHub: ggml-org/llama.cpp b7964


model : support Step3.5-Flash (#19283)

  • Support Step3.5-Flash

  • fix: norm.weight + 1 (HF zero_centered=true)

  • step35: simplify GGUF conversion + drop redundant rope KVs

  • Address review feedback

  • rename limits -> clamp

  • Apply suggestions from code review

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Apply suggestion from @CISC

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • rename swiglu limits -> swiglu clamp in LLM_KV

  • avoid CI fail

  • Apply suggestions from code review

  • Apply suggestions from code review

  • disabled KV shifting for LLM_ARCH_STEP35

  • Apply suggestions from code review

  • mistakenly removed cmath

  • add model size && apply missed suggestion

  • assert partial_rotary_factors

  • fix CI errors:

  • load freq_base_swa


Co-authored-by: lvyichen <lvyichen@stepfun.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
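The `norm.weight + 1` fix above refers to HF checkpoints with `zero_centered=true`, where RMSNorm weights are stored as offsets from 1, so the effective scale is `weight + 1`. A minimal NumPy sketch of the idea (the function name and signature are illustrative, not llama.cpp code):

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6, zero_centered=False):
    # With zero_centered=True the stored weight is an offset from 1,
    # so the effective per-channel scale is (weight + 1).
    scale = weight + 1.0 if zero_centered else weight
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * scale
```

A converter can equivalently bake the offset in once at conversion time (storing `weight + 1` in the GGUF), so inference runs a plain RMSNorm.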
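The `swiglu limits -> clamp` rename concerns a SwiGLU variant that clamps activations before combining them. A generic sketch of clamped SwiGLU; the symmetric clamp and the optional-argument shape are illustrative assumptions, not the model's actual scheme:

```python
import numpy as np

def swiglu(gate, up, clamp=None):
    # Optionally clamp both streams to [-clamp, clamp] before combining,
    # guarding against activation blow-up (exact scheme is model-specific).
    if clamp is not None:
        gate = np.clip(gate, -clamp, clamp)
        up = np.clip(up, -clamp, clamp)
    silu = gate / (1.0 + np.exp(-gate))  # SiLU(gate) = gate * sigmoid(gate)
    return silu * up
```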
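The `assert partial_rotary_factors` item relates to models that apply RoPE to only a fraction of each head's dimensions (HF's `partial_rotary_factor`). A sketch of the split into rotary and pass-through parts, assuming the common half-split rotation layout; names here are hypothetical:

```python
import numpy as np

def apply_partial_rope(x, cos, sin, rotary_factor):
    # Rotate only the first n_rot dims of each head; pass the rest through.
    head_dim = x.shape[-1]
    n_rot = int(head_dim * rotary_factor)
    x_rot, x_pass = x[..., :n_rot], x[..., n_rot:]
    x1, x2 = x_rot[..., : n_rot // 2], x_rot[..., n_rot // 2 :]
    rotated = np.concatenate(
        (x1 * cos - x2 * sin, x1 * sin + x2 * cos), axis=-1
    )
    return np.concatenate((rotated, x_pass), axis=-1)
```

Asserting the factor at load time catches checkpoints whose `head_dim * rotary_factor` does not produce an even, in-range `n_rot`.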

Release binaries are available for macOS/iOS, Linux, Windows, and openEuler.
