github ggml-org/llama.cpp b7876

hexagon: enable offloading to Hexagon on Windows on Snapdragon (#19150)

  • hexagon: updates to enable offloading to HTP on WoS

  • Update windows.md

  • Update windows.md

  • hexagon: enable -O3 optimizations

  • hexagon: move all _WINDOWS conditional compilation to _WIN32

  • hexagon: updates to enable offloading to HTP on WoS

  • hexagon: use run-time vs load-time dynamic linking for cdsp driver interface

  • refactor htp-drv

  • hexagon: add run-bench.ps1 script

  • hexagon: htpdrv refactor

  • hexagon: unify Android and Windows build readmes

  • hexagon: update README.md

  • hexagon: refactor htpdrv

  • hexagon: drv refactor

  • hexagon: more drv refactor

  • hexagon: fixes for android builds

  • hexagon: factor out dl into ggml-backend-dl

  • hexagon: add run-tool.ps1 script

  • hexagon: merge htp-utils in htp-drv and remove unused code

  • wos: no need for getopt_custom.h

  • wos: add missing CR in htpdrv

  • hexagon: ndev enforcement applies only to Android devices

  • hexagon: add support for generating and signing .cat file

  • hexagon: add .inf file

  • hexagon: working auto-signing and improved windows builds

  • hexagon: further improve skel build

  • hexagon: add rough WoS guide

  • hexagon: updated windows guide

  • hexagon: improve cmake handling of certs and logging

  • hexagon: improve windows setup/build doc

  • hexagon: more windows readme updates

  • hexagon: windows readme updates

  • hexagon: windows readme updates

  • hexagon: windows readme updates

  • hexagon: windows readme updates

  • Update windows.md

  • Update windows.md

  • snapdragon: rename docs/backend/hexagon to docs/backends/snapdragon

Also added a PowerShell script to simplify build environment setup.

  • hexagon: remove trailing whitespace and move cmake requirement to user-presets

  • hexagon: fix CMakeUserPresets path in workflow yaml

  • hexagon: introduce local version of libdl.h

  • hexagon: fix src1 reuse logic

gpt-oss needs a bigger lookahead window.
The check for src[1] itself being quantized was wrong.


Co-authored-by: Max Krasnyansky <maxk@qti.qualcomm.com>
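
One of the changes listed above moves all `_WINDOWS` conditional compilation to `_WIN32`. A likely motivation is that `_WIN32` is predefined by MSVC, Clang, and MinGW GCC for every Windows target (including 64-bit and ARM64), whereas `_WINDOWS` is only present when a project or build system defines it. The sketch below illustrates the pattern only; `shared_lib_name` is a hypothetical helper and is not taken from the llama.cpp sources, and the `lib<name>.so` naming in the non-Windows branch is an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of llama.cpp): builds a shared-library file
 * name for the current target. The branch is chosen at compile time using
 * _WIN32, which Windows compilers predefine on their own, so no build-system
 * definition is required. */
int shared_lib_name(const char *base, char *out, size_t out_size) {
    assert(base != NULL && out != NULL);
#if defined(_WIN32)
    int n = snprintf(out, out_size, "%s.dll", base);
#else
    /* Assumption: ELF-style naming on every non-Windows target. */
    int n = snprintf(out, out_size, "lib%s.so", base);
#endif
    /* snprintf returns the length it needed; treat truncation as an error. */
    return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

Calling `shared_lib_name("demo", buf, sizeof buf)` fills `buf` with `demo.dll` on a Windows build and `libdemo.so` elsewhere, returning 0 on success and -1 if the buffer is too small.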
