github ggml-org/llama.cpp b8030


CUDA: Do not mutate cgraph for fused ADDs (#19566)

  • Do not mutate cgraph for fused ADDs
  1. In-place changes to the incoming ggml_cgraph should be minimized
    where possible; such changes belong in graph_optimize.
  2. Modifying the graph in place triggers an additional, unnecessary
    graph capture step, because the CUDA backend stores the graph's
    properties before the in-place modification happens.
  • Assert that ggml_tensor is trivially copyable

  • Update ggml/src/ggml-cuda/ggml-cuda.cu

Co-authored-by: Aman Gupta amangupta052@gmail.com

Release builds are available for macOS/iOS, Linux, Windows, and openEuler.
