github Mozilla-Ocho/llamafile 0.4.1
llamafile v0.4.1


llamafile lets you distribute and run LLMs with a single file

[line drawing of a llama head in front of a slightly open manila folder filled with files]

If you had trouble generating filenames by following the "bash one-liners"
blog post with the latest release, then please try again.

  • 0984ed8 Fix regression with --grammar flag
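The `--grammar` flag constrains generation with a GBNF grammar, which is what the blog post's filename-generation trick relies on. Here's a minimal sketch; the grammar, filenames, and model path in the comment are illustrative, not from this release:

```shell
# Write a minimal GBNF grammar that only allows a lowercase
# alphanumeric filename ending in ".txt" (illustrative).
cat > filename.gbnf <<'EOF'
root ::= [a-z0-9_-]+ ".txt"
EOF

# Constrain generation with it (commented out; requires a model file):
# ./llamafile -m mistral-7b-instruct.Q4_K_M.gguf \
#     --grammar "$(cat filename.gbnf)" -p 'A good name for this file is: '
```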

Crashes on older Intel / AMD systems should be fixed:

  • 3490afa Fix SIGILL on older Intel/AMD CPUs w/o F16C

The OpenAI API compatible endpoint has been improved.

  • 9e4bf29 Fix OpenAI server sampling w.r.t. temp and seed
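For reference, this is roughly how those sampling fields appear in a request to the OpenAI-compatible endpoint. A hedged sketch, assuming a server at the default localhost:8080; the prompt and parameter values are illustrative:

```shell
# Build a chat completion request body; "temperature" and "seed" are
# the sampling fields this release fixes (values are illustrative).
cat > request.json <<'EOF'
{
  "model": "LLaMA_CPP",
  "messages": [{"role": "user", "content": "Say hello."}],
  "temperature": 0.0,
  "seed": 1234
}
EOF

# Send it (commented out; requires a running llamafile server):
# curl http://localhost:8080/v1/chat/completions \
#   -H "Content-Type: application/json" -d @request.json
```

With a fixed seed and a temperature of 0.0, repeated requests should produce reproducible output.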

This release improves the documentation.

  • 5c7ff6e Improve llamafile manual
  • 658b18a Add WSL CUDA to GPU section (#105)
  • 586b408 Update README.md so links and curl commands work (#136)
  • a56ffd4 Update README to clarify Darwin kernel versioning
  • 47d8a8f Fix README changing SSE3 to SSSE3
  • 4da8e2e Fix README examples for certain UNIX shells
  • faa7430 Change README to list Mixtral Q5 (instead of Q3)
  • 6b0b64f Fix CLI README examples

We're making strides toward automating our testing process.

Some other improvements:

  • 9e972b2 Improve README examples
  • 9de5686 Support bos token in llava-cli
  • 3d81e22 Set logger callback for Apple Metal
  • 9579b73 Make it easier to override CPPFLAGS

Our .llamafiles on Hugging Face have been updated to incorporate these
new release binaries. You can re-download them from our Hugging Face page.

Known Issues

LLaVA image processing using the built-in tinyBLAS library may be slow on Windows.
Here's the workaround for using the faster NVIDIA cuBLAS library instead:

  1. Delete the .llamafile directory in your home directory.
  2. Install CUDA.
  3. Install MSVC.
  4. Open the "x64 MSVC command prompt" from the Start menu.
  5. Run llamafile there for the first invocation.

There's a YouTube video tutorial on doing this here: https://youtu.be/d1Fnfvat6nM?si=W6Y0miZ9zVBHySFj
