github ggml-org/llama.cpp b7275

latest releases: b7278, b7276
7 hours ago

Warning

Release Format Update: Linux releases will soon use .tar.gz archives instead of .zip. Please make the necessary changes to your deployment scripts.

metal: TRI, FILL, EXPM1, SOFTPLUS (#16623)

  • feat(wip): Port initial TRI impl from pervious work

The kernel does not work and is not optimized, but the
code compiles and runs, so this will be the starting point
now that the core op has been merged.

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart ghart@us.ibm.com

  • fix: Remove argument for constant val override

This was added in the original draft, but later removed. With this, the
kernel now passes tests.

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart ghart@us.ibm.com

  • feat: Move the ttype conditional to templating to avoid conditional in kernel

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart ghart@us.ibm.com

  • fix: Type fixes

Signed-off-by: Gabe Goodhart ghart@us.ibm.com
Co-authored-by: Georgi Gerganov ggerganov@gmail.com

Co-authored-by: Georgi Gerganov ggerganov@gmail.com

  • feat: Add softplus for metal

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart ghart@us.ibm.com

  • feat: Add EXPM1 for metal

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart ghart@us.ibm.com

  • feat: Add FILL for metal

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart ghart@us.ibm.com

  • refactor: Branchless version of tri using _ggml_vec_tri_cmp as a mask

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart ghart@us.ibm.com

  • fix: Remove unused arguments

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart ghart@us.ibm.com

  • refactor: Use select instead of branch for softplus non-vec

Branch: ggml-cumsum-tri

Signed-off-by: Gabe Goodhart ghart@us.ibm.com


Signed-off-by: Gabe Goodhart ghart@us.ibm.com
Co-authored-by: Georgi Gerganov ggerganov@gmail.com

macOS/iOS:

Linux:

Windows:

Don't miss a new llama.cpp release

NewReleases is sending notifications on new releases.