github ggml-org/llama.cpp b8220


CUDA: use shared mem for ssm_conv (#20128)

  • CUDA: use shared mem for ssm_conv

  • fuse silu + ssm_conv

  • fuse unary + mul

  • enable for fp16

  • formatting

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

Release binaries are available for: macOS/iOS, Linux, Windows, openEuler.
