llamafile lets you distribute and run LLMs with a single file
The `llamafile-main` and `llamafile-llava-cli` programs have been unified into a single command named `llamafile`. Man pages now exist in PDF, troff, and PostScript format. There's much better support for shell scripting, thanks to a new `--silent-prompt` flag. It's now possible to shell script vision models like LLaVA using grammar constraints.
- d4e2388 Add --version flag
- baf216a Make ctrl-c work better
- 762ad79 Add `make install` build rule
- 7a3e557 Write man pages for all commands
- c895a44 Remove stdout logging in llava-cli
- 6cb036c Make LLaVA more shell script friendly
- 28d3160 Introduce --silent-prompt flag to main
- 1cd334f Allow --grammar to be used on --image prompts
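As a sketch of what these changes enable together, something like the following should constrain a LLaVA answer to machine-readable output. The model filenames and image path are placeholders, not part of this release, so treat this as an illustration rather than a verified recipe:

```shell
# Hypothetical example: force LLaVA to answer "yes" or "no" about an image.
# --silent-prompt suppresses echoing the prompt, so stdout holds only the
# model's answer, and --grammar constrains generation to the given grammar.
./llamafile --silent-prompt \
  -m llava-v1.5-7b-Q4_K.gguf \
  --mmproj llava-v1.5-7b-mmproj-Q4_0.gguf \
  --image photo.jpg \
  -p 'Does this image contain an animal?' \
  --grammar 'root ::= "yes" | "no"' -n 16
```

Because stdout is just the constrained answer, it can be captured directly in a shell variable, e.g. `answer=$(./llamafile ...)`.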
The OpenAI API in `llamafile-server` has been improved.
- e8c92bc Make OpenAI API `stop` field optional (#36)
- c1c8683 Avoid bind() conflicts on port 8080 w/ server
- 8cb9fd8 Recognize cache_prompt parameter in OpenAI API
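A minimal sketch of exercising these changes, assuming `llamafile-server` is already running on its default port 8080 and exposes the usual OpenAI-style chat completions path:

```shell
# Hypothetical example: the "stop" field is omitted (it's now optional),
# and cache_prompt asks the server to reuse the evaluated prompt prefix
# across requests, speeding up follow-up calls with a shared preamble.
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "local",
    "cache_prompt": true,
    "messages": [{"role": "user", "content": "Say hello in one word."}]
  }'
```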
Performance regressions have been fixed for Intel and AMD users.
- 73ee0b1 Add runtime dispatching for Q5 weights
- 36b103e Make Q2/Q3 weights go 2x faster on AMD64 AVX2 CPUs
- b4dea04 Slightly speed up LLaVA runtime dispatch on Intel
The `zipalign` command is now feature complete.
- 76d47c0 Put finishing touches on zipalign tool
- 7b2fbcb Add support for replacing zip files to zipalign
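The intended workflow, sketched here with placeholder filenames that are not part of this release, is to append GGUF weights (and an optional `.args` file of default flags) to a copy of the llamafile executable:

```shell
# Hypothetical example: build a self-contained llamafile.
cp llamafile mistral.llamafile
printf -- '-m\nmistral-7b-instruct.Q4_K_M.gguf\n' > .args
# -j0 stores the weights uncompressed so they can be memory-mapped in place.
zipalign -j0 mistral.llamafile mistral-7b-instruct.Q4_K_M.gguf .args
```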
Some additional improvements:
- 5f69bb9 Add SVG logo
- cd0fae0 Make memory map loader go much faster on MacOS
- c8cd8e1 Fix output path in llamafile-quantize
- dd1e0cd Support attention_bias on LLaMA architecture
- 55467d9 Fix integer overflow during quantization
- ff1b437 Have makefile download cosmocc automatically
- a7cc180 Update grammar-parser.cpp (#48)
- 61944b5 Disable pledge on systems with GPUs
- ccc377e Log cuda build command to stderr
Our .llamafiles on Hugging Face have been updated to incorporate these new release binaries. You can re-download them here:
- https://huggingface.co/jartine/llava-v1.5-7B-GGUF/tree/main
- https://huggingface.co/jartine/mistral-7b.llamafile/tree/main
- https://huggingface.co/jartine/wizardcoder-13b-python/tree/main
If you have a slower Internet connection and don't want to re-download, then you don't have to! Instructions are here: