3.0.0-beta.18 (2024-05-09)
Bug Fixes
- more efficient max context size finding algorithm (#214) (453c162)
- make embedding-only models work correctly (#214) (453c162)
- perform context shift on the correct token index on generation (#214) (453c162)
- make context loading work for all models on Electron (#214) (453c162)
Features
- split gguf files support (#214) (453c162)
- `pull` command (#214) (453c162)
- `stopOnAbortSignal` and `customStopTriggers` on `LlamaChat` and `LlamaChatSession` (#214) (453c162)
- `checkTensors` parameter on `loadModel` (#214) (453c162)
- improve Electron support (#214) (453c162)
Shipped with llama.cpp release b2834
To use the latest `llama.cpp` release available, run `npx --no node-llama-cpp download --release latest`.