🚀 LocalAI 3.1
🚀 Highlights
Support for Gemma 3n!
Gemma 3n has been released and is now available in LocalAI (currently for text generation only). Install it with:

```
local-ai run gemma-3n-e2b-it
local-ai run gemma-3n-e4b-it
```
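Once the model is running, it is served through LocalAI's OpenAI-compatible REST API. A minimal request sketch, assuming the default port 8080 (adjust host and port to your setup):

```shell
# Send a chat-completions request to the local Gemma 3n model
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-3n-e2b-it",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Any OpenAI-compatible client library can be pointed at the same endpoint by overriding its base URL.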
⚠️ Breaking Changes
Several important changes that reduce image size, simplify the ecosystem, and pave the way for a leaner LocalAI core:
🧰 Container Image Changes
- Sources are no longer bundled in the container images. This significantly reduces image sizes.
- Need to rebuild locally? Just follow the docs to build from scratch. We're working towards migrating all backends to the gallery, slimming down the default image further.
📁 Directory Structure Updated
New default model and backend paths for container images:
- Models: `/models/` (was `/build/models`)
- Backends: `/backends/` (was `/build/backends`)
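If you bind-mount models or backends from the host, the mount targets change accordingly. A sketch of a container start using the new paths (image tag and host directories are illustrative):

```shell
# Mount host directories at the new in-container locations
docker run -p 8080:8080 \
  -v "$PWD/models:/models" \
  -v "$PWD/backends:/backends" \
  localai/localai:latest
```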
🏷 Unified Image Tag Naming for `master` (development) builds
We've cleaned up and standardized container image tags for clarity and consistency:
- `gpu-nvidia-cuda11` and `gpu-nvidia-cuda12` (previously `cublas-cuda11`, `cublas-cuda12`)
- `gpu-intel-f16` and `gpu-intel-f32` (previously `sycl-f16`, `sycl-f32`)
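In practice this only changes the tag you pull for development builds. For example (tag names follow the scheme above; check the registry for the exact published tags):

```shell
# Previously:
#   docker pull localai/localai:master-cublas-cuda12
# With the new naming:
docker pull localai/localai:master-gpu-nvidia-cuda12
```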
Meta packages in backend galleries
We’ve introduced meta-packages to the backend gallery!
These packages automatically install the most suitable backend for the GPU detected in your system, saving time, reducing errors, and ensuring you get the right setup out of the box. They will be added as soon as the 3.1.0 images are published, so stay tuned!
For instance, you will be able to install vLLM just by installing the `vllm` backend from the gallery (no need to select the correct GPU version anymore).
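Assuming the CLI's backend-gallery subcommand described in the backend management docs, installation would then be a single step (command shape is a sketch; consult the docs for your version):

```shell
# Install the meta-package; LocalAI resolves the right GPU variant automatically
local-ai backends install vllm
```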
The Complete Local Stack for Privacy-First AI
With LocalAGI rejoining LocalAI alongside LocalRecall, our ecosystem provides a complete, open-source stack for private, secure, and intelligent AI operations:
- **LocalAI**: The free, Open Source OpenAI alternative. Acts as a drop-in replacement REST API compatible with OpenAI specifications for local AI inferencing. No GPU required.
- **LocalAGI**: A powerful Local AI agent management platform. Serves as a drop-in replacement for OpenAI's Responses API, supercharged with advanced agentic capabilities and a no-code UI.
- **LocalRecall**: A RESTful API and knowledge base management system providing persistent memory and storage capabilities for AI agents. Designed to work alongside LocalAI and LocalAGI.
Join the Movement! ❤️
A massive THANK YOU to our incredible community and our sponsors! LocalAI has over 33,500 stars, and LocalAGI has already rocketed past 800+ stars!
As a reminder, LocalAI is real FOSS (Free and Open Source Software), and its sibling projects are community-driven, not backed by VCs or a company. We rely on contributors donating their spare time and on our sponsors who provide the hardware! If you love open-source, privacy-first AI, please consider starring the repos, contributing code, reporting bugs, or spreading the word!
👉 Check out the reborn LocalAGI v2 today: https://github.com/mudler/LocalAGI
Full changelog 👇
What's Changed
Breaking Changes 🛠
- chore(ci): ⚠️ fix latest tag by using docker meta action by @mudler in #5722
- feat: ⚠️ reduce images size and stop bundling sources by @mudler in #5721
Bug fixes 🐛
Exciting New Features 🎉
🧠 Models
- chore(model gallery): add qwen3-the-josiefied-omega-directive-22b-uncensored-abliterated-i1 by @mudler in #5704
- chore(model gallery): add menlo_jan-nano by @mudler in #5705
- chore(model gallery): add qwen3-the-xiaolong-omega-directive-22b-uncensored-abliterated-i1 by @mudler in #5706
- chore(model gallery): add allura-org_q3-8b-kintsugi by @mudler in #5707
- chore(model gallery): add ds-r1-qwen3-8b-arliai-rpr-v4-small-iq-imatrix by @mudler in #5708
- chore(model gallery): add mistralai_mistral-small-3.2-24b-instruct-2506 by @mudler in #5714
- chore(model gallery): add skywork_skywork-swe-32b by @mudler in #5715
- chore(model gallery): add astrosage-70b by @mudler in #5716
- chore(model gallery): add delta-vector_austral-24b-winton by @mudler in #5717
- chore(model gallery): add menlo_jan-nano-128k by @mudler in #5723
- chore(model gallery): add gemma-3n-e2b-it by @mudler in #5730
- chore(model gallery): add gemma-3n-e4b-it by @mudler in #5731
👒 Dependencies
- chore: ⬆️ Update ggml-org/whisper.cpp to `3e65f518ddf840b13b74794158aa95a2c8aa30cc` by @localai-bot in #5691
- chore: ⬆️ Update ggml-org/llama.cpp to `8f71d0f3e86ccbba059350058af8758cafed73e6` by @localai-bot in #5692
- chore: ⬆️ Update ggml-org/llama.cpp to `06cbedfca1587473df9b537f1dd4d6bfa2e3de13` by @localai-bot in #5697
- chore: ⬆️ Update ggml-org/whisper.cpp to `e6c10cf3d5d60dc647eb6cd5e73d3c347149f746` by @localai-bot in #5702
- chore: ⬆️ Update ggml-org/llama.cpp to `aa0ef5c578eef4c2adc7be1282f21bab5f3e8d26` by @localai-bot in #5703
- chore: ⬆️ Update ggml-org/llama.cpp to `238005c2dc67426cf678baa2d54c881701693288` by @localai-bot in #5710
- chore: ⬆️ Update ggml-org/whisper.cpp to `a422176937c5bb20eb58d969995765f90d3c1a9b` by @localai-bot in #5713
- chore: ⬆️ Update ggml-org/llama.cpp to `ce82bd0117bd3598300b3a089d13d401b90279c7` by @localai-bot in #5712
- chore: ⬆️ Update ggml-org/llama.cpp to `73e53dc834c0a2336cd104473af6897197b96277` by @localai-bot in #5719
- chore: ⬆️ Update ggml-org/whisper.cpp to `0083335ba0e9d6becbe0958903b0a27fc2ebaeed` by @localai-bot in #5718
- chore: ⬆️ Update leejet/stable-diffusion.cpp to `10c6501bd05a697e014f1bee3a84e5664290c489` by @localai-bot in #4925
- chore: ⬆️ Update ggml-org/llama.cpp to `2bf9d539dd158345e3a3b096e16474af535265b4` by @localai-bot in #5724
- chore: ⬆️ Update ggml-org/whisper.cpp to `4daf7050ca2bf17f5166f45ac6da651c4e33f293` by @localai-bot in #5725
- Revert "chore: ⬆️ Update leejet/stable-diffusion.cpp to `10c6501bd05a697e014f1bee3a84e5664290c489`" by @mudler in #5727
- chore: ⬆️ Update ggml-org/llama.cpp to `8846aace4934ad29651ea61b8c7e3f6b0556e3d2` by @localai-bot in #5734
- chore: ⬆️ Update ggml-org/whisper.cpp to `32cf4e2aba799aff069011f37ca025401433cf9f` by @localai-bot in #5733
Other Changes
Full Changelog: v3.0.0...v3.1.0