🚀 LocalAI 3.1
🚀 Highlights
Support for Gemma 3n!
Gemma 3n has been released and is now available in LocalAI (currently for text generation only). Install it with:

```
local-ai run gemma-3n-e2b-it
local-ai run gemma-3n-e4b-it
```
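Once the model is running, it is served through LocalAI's OpenAI-compatible REST API. A minimal request sketch, assuming the default port 8080 (adjust host and port to your setup):

```shell
# Send a chat-completions request to the local Gemma 3n model
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-3n-e2b-it",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Any OpenAI-compatible client library can be pointed at the same endpoint by overriding its base URL.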
⚠️ Breaking Changes
Several important changes that reduce image size, simplify the ecosystem, and pave the way for a leaner LocalAI core:
🧰 Container Image Changes
- Sources are no longer bundled in the container images. This significantly reduces image sizes.
- Need to rebuild locally? Just follow the docs to build from scratch. We're working towards migrating all backends to the gallery, slimming down the default image further.
📁 Directory Structure Updated
New default model and backend paths for container images:
- Models: `/models/` (was `/build/models`)
- Backends: `/backends/` (was `/build/backends`)
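If you bind-mount models or backends from the host, the mount targets change accordingly. A sketch of a container start using the new paths (image tag and host directories are illustrative):

```shell
# Mount host directories at the new in-container locations
docker run -p 8080:8080 \
  -v "$PWD/models:/models" \
  -v "$PWD/backends:/backends" \
  localai/localai:latest
```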
🏷 Unified Image Tag Naming for `master` (development) builds
We've cleaned up and standardized container image tags for clarity and consistency:
- `gpu-nvidia-cuda11` and `gpu-nvidia-cuda12` (previously `cublas-cuda11`, `cublas-cuda12`)
- `gpu-intel-f16` and `gpu-intel-f32` (previously `sycl-f16`, `sycl-f32`)
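In practice this only changes the tag you pull for development builds. For example (tag names follow the scheme above; check the registry for the exact published tags):

```shell
# Previously:
#   docker pull localai/localai:master-cublas-cuda12
# With the new naming:
docker pull localai/localai:master-gpu-nvidia-cuda12
```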
Meta packages in backend galleries
We’ve introduced meta-packages to the backend gallery!
These packages automatically install the most suitable backend for the GPU detected in your system, saving time, reducing errors, and ensuring you get the right setup out of the box. They will be added as soon as the 3.1.0 images are published, so stay tuned!
For instance, you will be able to install vLLM just by installing the `vllm` backend from the gallery (no need to select the correct GPU version anymore).
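Assuming the CLI's backend-gallery subcommand described in the backend management docs, installation would then be a single step (command shape is a sketch; consult the docs for your version):

```shell
# Install the meta-package; LocalAI resolves the right GPU variant automatically
local-ai backends install vllm
```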
The Complete Local Stack for Privacy-First AI
With LocalAGI rejoining LocalAI alongside LocalRecall, our ecosystem provides a complete, open-source stack for private, secure, and intelligent AI operations:
- **LocalAI**: The free, Open Source OpenAI alternative. Acts as a drop-in replacement REST API compatible with OpenAI specifications for local AI inferencing. No GPU required.
- **LocalAGI**: A powerful Local AI agent management platform. Serves as a drop-in replacement for OpenAI's Responses API, supercharged with advanced agentic capabilities and a no-code UI.
- **LocalRecall**: A RESTful API and knowledge base management system providing persistent memory and storage capabilities for AI agents. Designed to work alongside LocalAI and LocalAGI.
Join the Movement! ❤️
A massive THANK YOU to our incredible community and our sponsors! LocalAI has over 33,500 stars, and LocalAGI has already rocketed past 800+ stars!
As a reminder, LocalAI is real FOSS (Free and Open Source Software), and its sibling projects are community-driven, not backed by VCs or a company. We rely on contributors donating their spare time and on our sponsors who provide the hardware! If you love open-source, privacy-first AI, please consider starring the repos, contributing code, reporting bugs, or spreading the word!
👉 Check out the reborn LocalAGI v2 today: https://github.com/mudler/LocalAGI
Full changelog 👇
What's Changed
Breaking Changes 🛠
- chore(ci): ⚠️ fix latest tag by using docker meta action by @mudler in #5722
- feat: ⚠️ reduce images size and stop bundling sources by @mudler in #5721
Bug fixes 🐛
Exciting New Features 🎉
🧠 Models
- chore(model gallery): add qwen3-the-josiefied-omega-directive-22b-uncensored-abliterated-i1 by @mudler in #5704
- chore(model gallery): add menlo_jan-nano by @mudler in #5705
- chore(model gallery): add qwen3-the-xiaolong-omega-directive-22b-uncensored-abliterated-i1 by @mudler in #5706
- chore(model gallery): add allura-org_q3-8b-kintsugi by @mudler in #5707
- chore(model gallery): add ds-r1-qwen3-8b-arliai-rpr-v4-small-iq-imatrix by @mudler in #5708
- chore(model gallery): add mistralai_mistral-small-3.2-24b-instruct-2506 by @mudler in #5714
- chore(model gallery): add skywork_skywork-swe-32b by @mudler in #5715
- chore(model gallery): add astrosage-70b by @mudler in #5716
- chore(model gallery): add delta-vector_austral-24b-winton by @mudler in #5717
- chore(model gallery): add menlo_jan-nano-128k by @mudler in #5723
- chore(model gallery): add gemma-3n-e2b-it by @mudler in #5730
- chore(model gallery): add gemma-3n-e4b-it by @mudler in #5731
👒 Dependencies
- chore: ⬆️ Update ggml-org/whisper.cpp to `3e65f518ddf840b13b74794158aa95a2c8aa30cc` by @localai-bot in #5691
- chore: ⬆️ Update ggml-org/llama.cpp to `8f71d0f3e86ccbba059350058af8758cafed73e6` by @localai-bot in #5692
- chore: ⬆️ Update ggml-org/llama.cpp to `06cbedfca1587473df9b537f1dd4d6bfa2e3de13` by @localai-bot in #5697
- chore: ⬆️ Update ggml-org/whisper.cpp to `e6c10cf3d5d60dc647eb6cd5e73d3c347149f746` by @localai-bot in #5702
- chore: ⬆️ Update ggml-org/llama.cpp to `aa0ef5c578eef4c2adc7be1282f21bab5f3e8d26` by @localai-bot in #5703
- chore: ⬆️ Update ggml-org/llama.cpp to `238005c2dc67426cf678baa2d54c881701693288` by @localai-bot in #5710
- chore: ⬆️ Update ggml-org/whisper.cpp to `a422176937c5bb20eb58d969995765f90d3c1a9b` by @localai-bot in #5713
- chore: ⬆️ Update ggml-org/llama.cpp to `ce82bd0117bd3598300b3a089d13d401b90279c7` by @localai-bot in #5712
- chore: ⬆️ Update ggml-org/llama.cpp to `73e53dc834c0a2336cd104473af6897197b96277` by @localai-bot in #5719
- chore: ⬆️ Update ggml-org/whisper.cpp to `0083335ba0e9d6becbe0958903b0a27fc2ebaeed` by @localai-bot in #5718
- chore: ⬆️ Update leejet/stable-diffusion.cpp to `10c6501bd05a697e014f1bee3a84e5664290c489` by @localai-bot in #4925
- chore: ⬆️ Update ggml-org/llama.cpp to `2bf9d539dd158345e3a3b096e16474af535265b4` by @localai-bot in #5724
- chore: ⬆️ Update ggml-org/whisper.cpp to `4daf7050ca2bf17f5166f45ac6da651c4e33f293` by @localai-bot in #5725
- Revert "chore: ⬆️ Update leejet/stable-diffusion.cpp to `10c6501bd05a697e014f1bee3a84e5664290c489`" by @mudler in #5727
- chore: ⬆️ Update ggml-org/llama.cpp to `8846aace4934ad29651ea61b8c7e3f6b0556e3d2` by @localai-bot in #5734
- chore: ⬆️ Update ggml-org/whisper.cpp to `32cf4e2aba799aff069011f37ca025401433cf9f` by @localai-bot in #5733
Other Changes
Full Changelog: v3.0.0...v3.1.0