🎉 LocalAI 3.12.0 Release! 🚀

LocalAI 3.12.0 is out!

Feature	Summary
Multi-modal Realtime	Send text, images, and audio in real-time conversations for richer interactions.
Voxtral Backend	New high-quality text-to-speech backend added.
Multi-GPU Support	Improved Diffusers performance with multiple GPUs.
Legacy CPU Optimization	Enhanced compatibility for older processors.
UI Theme & Layout	Improved UI theme (dark/light variants) and navigation
Realtime Stability	Multiple fixes for audio, image, and model handling.
Logging Improvements	Reduced excessive logs and optimized processing.

Local Stack Family

Liking LocalAI? LocalAI is part of an integrated suite of AI infrastructure tools, you might also like:

LocalAGI - AI agent orchestration platform with OpenAI Responses API compatibility and advanced agentic capabilities
LocalRecall - MCP/REST API knowledge base system providing persistent memory and storage for AI agents
🆕 Cogito - Go library for building intelligent, co-operative agentic software and LLM-powered workflows, focusing on improving results for small, open source language models that scales to any LLM. Powers LocalAGI and LocalAI MCP/Agentic capabilities
🆕 Wiz - Terminal-based AI agent accessible via Ctrl+Space keybinding. Portable, local-LLM friendly shell assistant with TUI/CLI modes, tool execution with approval, MCP protocol support, and multi-shell compatibility (zsh, bash, fish)
🆕 SkillServer - Simple, centralized skills database for AI agents via MCP. Manages skills as Markdown files with MCP server integration, web UI for editing, Git synchronization, and full-text search capabilities

❤️ Thank You

LocalAI is a true FOSS movement — built by contributors, powered by community.

If you believe in privacy-first AI:

✅ Star the repo
💬 Contribute code, docs, or feedback
📣 Share with others

Your support keeps this stack alive.

✅ Full Changelog

📋 Click to expand full changelog

What's Changed

Bug fixes 🐛

security: validate URLs to prevent SSRF in content fetching endpoints by @kolega-ai-dev in #8476
fix(realtime): Use user provided voice and allow pipeline models to have no backend by @richiejp in #8415
fix(realtime): Sampling and websocket locking by @richiejp in #8521
fix(realtime): Send proper image data to backend by @richiejp in #8547
fix: prevent excessive logging in capability detection by @localai-bot in #8552
fix(voxcpm): pin setuptools by @mudler in #8556
fix(llama-cpp): populate tensor_buft_override buffer so llama-cpp properly performs fit calculations by @cvpcs in #8560
fix: pin neutts-air to known working commit by @localai-bot in #8566
fix: improve watchdown logics by @mudler in #8591
fix(llama-cpp): Pass parameters when using embedded template by @richiejp in #8590
fix(realtime): Better support for thinking models and setting model parameters by @richiejp in #8595
fix(realtime): Limit buffer sizes to prevent DoS by @richiejp in #8596
fix(ui): improve view on mobile by @mudler in #8598
fix(diffusers): sd_embed is not always available by @mudler in #8602
fix: do not keep track model if not existing by @mudler in #8603

Exciting New Features 🎉

feat(stablediffusion-ggml): Improve legacy CPU support for stablediffusion-ggml backend by @cvpcs in #8461
feat(voxtral): add voxtral backend by @mudler in #8451
feat(diffusers): add experimental support for sd_embed-style prompt embedding by @cvpcs in #8504
chore: improve log levels verbosity by @localai-bot in #8528
feat(realtime): Allow sending text, image and audio conversation items" by @richiejp in #8524
chore: compute capabilities once by @mudler in #8555
feat(ui): left navbar, dark/light theme by @mudler in #8594
fix: multi-GPU support for Diffusers (Issue #8575) by @localai-bot in #8605

🧠 Models

chore(model gallery): Add Ministral 3 family of models (aside from base versions) by @rampa3 in #8467
chore(model gallery): add voxtral (which is only available in development) by @mudler in #8532
chore(model gallery): Add npc-llm-3-8b by @rampa3 in #8498
chore(model gallery): add nemo-asr by @mudler in #8533
chore(model gallery): add voxcpm, whisperx, moonshine-tiny by @mudler in #8534
chore(model gallery): add neutts by @mudler in #8535
chore(model gallery): add vllm-omni models by @mudler in #8536
chore(model-gallery): ⬆️ update checksum by @localai-bot in #8540
feat(gallery): Add nanbeige4.1-3b by @richiejp in #8551
chore(model-gallery): ⬆️ update checksum by @localai-bot in #8593
chore(model-gallery): ⬆️ update checksum by @localai-bot in #8600

👒 Dependencies

chore(deps): bump github.com/anthropics/anthropic-sdk-go from 1.20.0 to 1.22.0 by @dependabot[bot] in #8482
chore(deps): bump github.com/jaypipes/ghw from 0.21.2 to 0.22.0 by @dependabot[bot] in #8484
chore(deps): bump github.com/onsi/ginkgo/v2 from 2.28.0 to 2.28.1 by @dependabot[bot] in #8483
chore(deps): bump github.com/alecthomas/kong from 1.13.0 to 1.14.0 by @dependabot[bot] in #8481
chore(deps): bump github.com/openai/openai-go/v3 from 3.17.0 to 3.19.0 by @dependabot[bot] in #8485
chore: bump cogito by @mudler in #8568
fix(gallery): Use YAML v3 to avoid merging maps with incompatible keys by @richiejp in #8580
chore(deps): bump google.golang.org/grpc from 1.78.0 to 1.79.1 by @dependabot[bot] in #8583
chore(deps): bump github.com/jaypipes/ghw from 0.22.0 to 0.23.0 by @dependabot[bot] in #8587
chore(deps): bump github.com/modelcontextprotocol/go-sdk from 1.2.0 to 1.3.0 by @dependabot[bot] in #8585
chore(deps): bump cogito and add new options to the agent config by @mudler in #8601

Other Changes

docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #8462
docs: update model gallery documentation to reference main repository by @veeceey in #8452
chore: ⬆️ Update ggml-org/whisper.cpp to 4b23ff249e7f93137cb870b28fb27818e074c255 by @localai-bot in #8463
chore: ⬆️ Update ggml-org/llama.cpp to e06088da0fa86aa444409f38dff274904931c507 by @localai-bot in #8464
chore: ⬆️ Update antirez/voxtral.c to c9e8773a2042d67c637fc492c8a655c485354080 by @localai-bot in #8477
chore: ⬆️ Update ggml-org/llama.cpp to 262364e31d1da43596fe84244fba44e94a0de64e by @localai-bot in #8479
chore: ⬆️ Update ggml-org/whisper.cpp to 764482c3175d9c3bc6089c1ec84df7d1b9537d83 by @localai-bot in #8478
chore: ⬆️ Update ggml-org/llama.cpp to 57487a64c88c152ac72f3aea09bd1cc491b2f61e by @localai-bot in #8499
chore: ⬆️ Update ggml-org/llama.cpp to 4d3daf80f8834e0eb5148efc7610513f1e263653 by @localai-bot in #8513
chore: ⬆️ Update ggml-org/llama.cpp to 338085c69e486b7155e5b03d7b5087e02c0e2528 by @localai-bot in #8538
fix: update moonshine API, add setuptools to voxcpm requirements by @mudler in #8541
chore: ⬆️ Update ggml-org/llama.cpp to 05a6f0e8946914918758db767f6eb04bc1e38507 by @localai-bot in #8553
chore: ⬆️ Update ggml-org/llama.cpp to 01d8eaa28d57bfc6d06e30072085ed0ef12e06c5 by @localai-bot in #8567
chore: ⬆️ Update ggml-org/whisper.cpp to 364c77f4ca2737e3287652e0e8a8c6dce3231bba by @localai-bot in #8576
chore: ⬆️ Update antirez/voxtral.c to 134d366c24d20c64b614a3dcc8bda2a6922d077d by @localai-bot in #8578
chore: ⬆️ Update ggml-org/llama.cpp to 27b93cbd157fc4ad94573a1fbc226d3e18ea1bb4 by @localai-bot in #8577
chore: ⬆️ Update ggml-org/llama.cpp to d612901116ab2066c7923372d4827032ff296bc4 by @localai-bot in #8588
chore: ⬆️ Update ggml-org/llama.cpp to 2b089c77580d347767f440205103e4da8ec33d89 by @localai-bot in #8592
chore: ⬆️ Update ggml-org/llama.cpp to b55dcdef5dcd74dc75c4921090e928d43453c157 by @localai-bot in #8599
chore: ⬆️ Update ggml-org/whisper.cpp to 21411d81ea736ed5d9cdea4df360d3c4b60a4adb by @localai-bot in #8606
chore: ⬆️ Update ggml-org/llama.cpp to 11c325c6e0666a30590cde390d5746a405e536b9 by @localai-bot in #8607
chore(ui): improve navigation and buttons placement by @mudler in #8608

New Contributors

@veeceey made their first contribution in #8452
@cvpcs made their first contribution in #8461
@kolega-ai-dev made their first contribution in #8476

Full Changelog: v3.11.0...v3.12.0

mudler/LocalAI v3.12.0 on GitHub