Mozilla-Ocho/llamafile 0.8.4 on GitHub

This release fixes underflows and overflows.

A memory bug in the grammar parser has been fixed. It caused commands like ./llamafile -m foo.gguf -p bar --grammar 'root::="' (which fails to specify a closing quote) to crash. Anyone using the server as a public-facing endpoint (despite our previous recommendations) is strongly encouraged to upgrade. See 22aba95 and 3fe045f. Credit for discovering (and most importantly, reporting) this issue goes to Eclypsium Security Researcher Richard Johnson. We incorrectly reported earlier that this fix was incorporated into the v0.8.2 release; you need to use the v0.8.4 release. This bug fix was upstreamed in ggerganov/llama.cpp#7194.
Our new vectorized expf() implementation now handles underflow by producing subnormals rather than flushing to zero. See b5c6df6.
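
To illustrate the difference between the two underflow behaviors, here is a minimal Python sketch. It is not llamafile's vectorized code: the function names are hypothetical, and it simply computes exp() in double precision and rounds the result to a 32-bit float, either keeping subnormal results or flushing them to zero.

```python
import math
import struct

# Smallest positive *normal* binary32 value (FLT_MIN in C).
FLT_MIN = 2.0 ** -126


def to_float32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def expf_gradual(x: float) -> float:
    """Hypothetical reference: exp(x) as binary32, keeping subnormals."""
    return to_float32(math.exp(x))


def expf_flush_to_zero(x: float) -> float:
    """Hypothetical reference: exp(x) as binary32, flushing subnormals to 0."""
    y = to_float32(math.exp(x))
    return 0.0 if abs(y) < FLT_MIN else y


def is_subnormal32(y: float) -> bool:
    """True if y is a nonzero binary32 value below the normal range."""
    return 0.0 < abs(y) < FLT_MIN


if __name__ == "__main__":
    for x in (-80.0, -90.0, -110.0):
        print(x, expf_gradual(x), expf_flush_to_zero(x))
```

For an input such as -90.0, the true result lies below the smallest normal float but above the smallest subnormal, so the gradual version returns a tiny nonzero value while the flushing version returns exactly zero. For inputs far enough below that range (for example -110.0), both versions return zero.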

See the instructions in #24 (comment) for how to put the latest llamafile software into your old weights without having to redownload them.
