GitHub: ggml-org/llama.cpp b7964


model : support Step3.5-Flash (#19283)

  • Support Step3.5-Flash

  • fix: norm.weight + 1 (HF zero_centered=true)

  • step35: simplify GGUF conversion + drop redundant rope KVs

  • Address review feedback

  • rename limits -> clamp

  • Apply suggestions from code review

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Apply suggestion from @CISC

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • rename swiglu limits -> swiglu clamp in LLM_KV

  • avoid CI fail

  • Apply suggestions from code review

  • Apply suggestions from code review

  • disabled KV shifting for LLM_ARCH_STEP35

  • Apply suggestions from code review

  • mistakenly removed cmath

  • add model size && apply missed suggestion

  • assert partial_rotary_factors

  • fix CI errors:

  • load freq_base_swa


Co-authored-by: lvyichen <lvyichen@stepfun.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
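The `norm.weight + 1` fix above refers to HF checkpoints with `zero_centered=true`, where RMSNorm weights are stored as offsets from 1, so the effective scale is `weight + 1`. A minimal NumPy sketch of the idea (the function name and signature are illustrative, not llama.cpp code):

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6, zero_centered=False):
    # With zero_centered=True the stored weight is an offset from 1,
    # so the effective per-channel scale is (weight + 1).
    scale = weight + 1.0 if zero_centered else weight
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * scale
```

A converter can equivalently bake the offset in once at conversion time (storing `weight + 1` in the GGUF), so inference runs a plain RMSNorm.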
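The `swiglu limits -> clamp` rename concerns a SwiGLU variant that clamps activations before combining them. A generic sketch of clamped SwiGLU; the symmetric clamp and the optional-argument shape are illustrative assumptions, not the model's actual scheme:

```python
import numpy as np

def swiglu(gate, up, clamp=None):
    # Optionally clamp both streams to [-clamp, clamp] before combining,
    # guarding against activation blow-up (exact scheme is model-specific).
    if clamp is not None:
        gate = np.clip(gate, -clamp, clamp)
        up = np.clip(up, -clamp, clamp)
    silu = gate / (1.0 + np.exp(-gate))  # SiLU(gate) = gate * sigmoid(gate)
    return silu * up
```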
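The `assert partial_rotary_factors` item relates to models that apply RoPE to only a fraction of each head's dimensions (HF's `partial_rotary_factor`). A sketch of the split into rotary and pass-through parts, assuming the common half-split rotation layout; names here are hypothetical:

```python
import numpy as np

def apply_partial_rope(x, cos, sin, rotary_factor):
    # Rotate only the first n_rot dims of each head; pass the rest through.
    head_dim = x.shape[-1]
    n_rot = int(head_dim * rotary_factor)
    x_rot, x_pass = x[..., :n_rot], x[..., n_rot:]
    x1, x2 = x_rot[..., : n_rot // 2], x_rot[..., n_rot // 2 :]
    rotated = np.concatenate(
        (x1 * cos - x2 * sin, x1 * sin + x2 * cos), axis=-1
    )
    return np.concatenate((rotated, x_pass), axis=-1)
```

Asserting the factor at load time catches checkpoints whose `head_dim * rotary_factor` does not produce an even, in-range `n_rot`.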

Release binaries are available for macOS/iOS, Linux, Windows, and openEuler.
