llamafile lets you distribute and run LLMs with a single file
If you had trouble generating filenames following the "bash one-liners"
blog post on the latest release, please try again.
- 0984ed8 Fix regression with --grammar flag
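For reference, the filename-generation recipe from that post looks roughly like the sketch below; the llamafile name, image path, and prompt are illustrative rather than exact.

```sh
# rough sketch of the blog post's recipe (filenames and prompt illustrative);
# --grammar constrains output to lowercase words separated by spaces
./llava-v1.5-7b-q4.llamafile \
    --image ~/Pictures/lemurs.jpg --temp 0 -n 16 \
    --grammar 'root ::= [a-z]+ (" " [a-z]+)+' \
    -p 'a short filename for this image:' --silent-prompt
```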
Crashes on older Intel/AMD systems should be fixed:
- 3490afa Fix SIGILL on older Intel/AMD CPUs w/o F16C
The OpenAI API-compatible endpoint has been improved.
- 9e4bf29 Fix OpenAI server sampling w.r.t. temp and seed
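As a quick check, a request along these lines against a running llamafile server (default address 127.0.0.1:8080; the field values here are illustrative) should now honor both settings, so a pinned seed with a fixed temperature yields reproducible samples:

```sh
# query the OpenAI-compatible chat endpoint; with "seed" pinned,
# repeated identical requests should now sample deterministically
curl http://127.0.0.1:8080/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{
        "model": "local",
        "temperature": 0.7,
        "seed": 42,
        "messages": [{"role": "user", "content": "Say hello."}]
    }'
```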
This release improves the documentation.
- 5c7ff6e Improve llamafile manual
- 658b18a Add WSL CUDA to GPU section (#105)
- 586b408 Update README.md so links and curl commands work (#136)
- a56ffd4 Update README to clarify Darwin kernel versioning
- 47d8a8f Fix README changing SSE3 to SSSE3
- 4da8e2e Fix README examples for certain UNIX shells
- faa7430 Change README to list Mixtral Q5 (instead of Q3)
- 6b0b64f Fix CLI README examples
We're making strides toward automating our testing process.
Some other improvements:
- 9e972b2 Improve README examples
- 9de5686 Support bos token in llava-cli
- 3d81e22 Set logger callback for Apple Metal
- 9579b73 Make it easier to override CPPFLAGS
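Regarding the CPPFLAGS change: when building from source, it should now be possible to pass extra preprocessor flags on the make command line, along these lines (the flag shown is a placeholder, not a real llamafile option):

```sh
# append a custom preprocessor define to the build (placeholder flag)
make -j8 CPPFLAGS=-DMY_EXTRA_FLAG
```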
Our .llamafiles on Hugging Face have been updated to incorporate these
new release binaries. You can redownload them here:
- https://huggingface.co/jartine/llava-v1.5-7B-GGUF/tree/main
- https://huggingface.co/jartine/Mistral-7B-Instruct-v0.2-llamafile/tree/main
- https://huggingface.co/jartine/wizardcoder-13b-python/tree/main
- https://huggingface.co/jartine/Mixtral-8x7B-Instruct-v0.1-llamafile
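For example, to fetch and run the updated LLaVA llamafile (the exact filename inside the repo may differ from the one shown here):

```sh
# download, mark executable, and run (filename is illustrative)
curl -LO https://huggingface.co/jartine/llava-v1.5-7B-GGUF/resolve/main/llava-v1.5-7b-q4.llamafile
chmod +x llava-v1.5-7b-q4.llamafile
./llava-v1.5-7b-q4.llamafile
```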
Known Issues
LLaVA image processing using the built-in tinyBLAS library may be slow on Windows.
Here's the workaround for using the faster NVIDIA cuBLAS library instead; a command-line sketch follows the steps below.
- Delete the .llamafile directory in your home directory
- Install CUDA
- Install MSVC
- Open the "x64 MSVC command prompt" from Start
- Run llamafile there for the first invocation
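Assuming CUDA and MSVC are already installed, the remaining steps look roughly like this inside the "x64 MSVC command prompt" (the llamafile name is illustrative, and on Windows the file typically needs a .exe extension):

```bat
:: remove the cached native module so the first run rebuilds it against cuBLAS
rmdir /s /q "%USERPROFILE%\.llamafile"
:: then run the llamafile once from this prompt; -ngl 35 requests GPU offload
llava-v1.5-7b-q4.llamafile.exe -ngl 35
```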
There's a YouTube video tutorial on doing this here: https://youtu.be/d1Fnfvat6nM?si=W6Y0miZ9zVBHySFj
*[Image: line drawing of a llama head in front of a slightly open manila folder filled with files]*