github mudler/LocalAI v2.19.1


LocalAI 2.19.1 is out! 📣

TL;DR: Summary spotlight

  • 🖧 Federated Instances via P2P: LocalAI now supports federated instances with P2P, offering both load-balanced and non-load-balanced options.
  • 🎛️ P2P Dashboard: A new dashboard to guide and assist in setting up P2P instances with auto-discovery using shared tokens.
  • 🔊 TTS Integration: Text-to-Speech (TTS) is now included in the binary releases.
  • 🛠️ Enhanced Installer: The installer script now supports setting up federated instances.
  • 📥 Model Pulling: Models can now be pulled directly via URL.
  • 🖼️ WebUI Enhancements: Visual improvements and cleanups to the WebUI and model lists.
  • 🧠 llama-cpp Backend: The llama-cpp (grpc) backend now supports embedding ( https://localai.io/features/embeddings/#llamacpp-embeddings )
  • ⚙️ Tool Support: Small enhancements to tools with disabled grammars.
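To illustrate the new llama.cpp embeddings support mentioned above, a running instance can be queried through the OpenAI-compatible endpoint. This is a minimal sketch assuming an instance listening on localhost:8080; "my-embedding-model" is a placeholder for any embedding-capable model you have configured (see the linked documentation for backend setup):

```shell
# Request embeddings from a running LocalAI instance.
# "my-embedding-model" is a placeholder: substitute a model configured
# with the llama.cpp backend and embeddings enabled.
curl http://localhost:8080/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-embedding-model",
    "input": "Your text string goes here"
  }'
```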

🖧 LocalAI Federation and AI swarms

LocalAI makes distributed AI workloads simpler and more accessible. No more complex setups or Docker and Kubernetes configurations – LocalAI lets you create your own AI cluster with minimal friction. By auto-discovering peers and sharing work or model weights across your existing devices, LocalAI aims to scale both horizontally and vertically with ease.

How does it work?

Starting LocalAI with --p2p generates a shared token for connecting multiple instances – and that's all you need to create an AI cluster, eliminating the need for intricate network setups. Simply navigate to the "Swarm" section in the WebUI and follow the on-screen instructions.
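For example, the flow above can be sketched as follows (a sketch only: the exact commands are printed in the Swarm tab and may differ between versions; "&lt;shared-token&gt;" is a placeholder):

```shell
# Start the first instance with P2P enabled; a shared token is generated
# and printed (it is also shown in the WebUI under "Swarm").
local-ai run --p2p

# On another machine, reuse that token so the new instance can
# auto-discover the first one.
TOKEN="<shared-token>" local-ai run --p2p
```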

For fully shared instances, start LocalAI with --p2p --federated and follow the Swarm section's guidance. This feature is still experimental and should be considered a tech preview.

Federated LocalAI

Launch multiple LocalAI instances and cluster them together to share requests across the cluster. The "Swarm" tab in the WebUI provides one-liner instructions on connecting various LocalAI instances using a shared token. Instances will auto-discover each other, even across different networks.
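The federation setup described above can be sketched as follows (hedged: the exact one-liners are shown in the Swarm tab and may differ from this sketch; "&lt;shared-token&gt;" is a placeholder):

```shell
# Node A: start a federated instance and note the shared token it prints.
local-ai run --p2p --federated

# Node B (possibly on a different network): join with the same token;
# the instances auto-discover each other and share requests.
TOKEN="<shared-token>" local-ai run --p2p --federated
```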


Check out a demonstration video: Watch now

LocalAI P2P Workers

Distribute model weights across nodes by starting multiple LocalAI workers. This is currently available only for the llama.cpp backend, with plans to expand to other backends soon.
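A possible worker setup is sketched below, assuming the worker subcommand shown in the WebUI's Swarm tab (subcommand names may vary by version; "&lt;shared-token&gt;" is a placeholder):

```shell
# Coordinator: start LocalAI with P2P enabled and note the shared token.
local-ai run --p2p

# Each worker node: join the swarm with the same token. The worker runs a
# llama.cpp RPC service that the coordinator can offload model weights to.
TOKEN="<shared-token>" local-ai worker p2p-llama-cpp-rpc
```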


Check out a demonstration video: Watch now

What's Changed

Bug fixes 🐛

  • fix: make sure the GNUMake jobserver is passed to cmake for the llama.cpp build by @cryptk in #2697
  • Using exec when starting a backend instead of spawning a new process by @a17t in #2720
  • fix(cuda): downgrade default version from 12.5 to 12.4 by @mudler in #2707
  • fix: Lora loading by @vaaale in #2893
  • fix: short-circuit when nodes aren't detected by @mudler in #2909
  • fix: do not list txt files as potential models by @mudler in #2910

🖧 P2P area

  • feat(p2p): Federation and AI swarms by @mudler in #2723
  • feat(p2p): allow to disable DHT and use only LAN by @mudler in #2751

Exciting New Features 🎉

  • Allows to remove a backend from the list by @mauromorales in #2721
  • ci(Makefile): adds tts in binary releases by @mudler in #2695
  • feat: HF /scan endpoint by @dave-gray101 in #2566
  • feat(model-list): be consistent, skip known files from listing by @mudler in #2760
  • feat(models): pull models from urls by @mudler in #2750
  • feat(webui): show also models without a config in the welcome page by @mudler in #2772
  • feat(install.sh): support federated install by @mudler in #2752
  • feat(llama.cpp): support embeddings endpoints by @mudler in #2871
  • feat(functions): parse broken JSON when we parse the raw results, use dynamic rules for grammar keys by @mudler in #2912
  • feat(federation): add load balanced option by @mudler in #2915

🧠 Models

  • models(gallery): ⬆️ update checksum by @localai-bot in #2701
  • models(gallery): add l3-8b-everything-cot by @mudler in #2705
  • models(gallery): add hercules-5.0-qwen2-7b by @mudler in #2708
  • models(gallery): add llama3-8b-darkidol-2.2-uncensored-1048k-iq-imatrix by @mudler in #2710
  • models(gallery): add llama-3-llamilitary by @mudler in #2711
  • models(gallery): add tess-v2.5-gemma-2-27b-alpha by @mudler in #2712
  • models(gallery): add arcee-agent by @mudler in #2713
  • models(gallery): add gemma2-daybreak by @mudler in #2714
  • models(gallery): add L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF by @mudler in #2715
  • models(gallery): add qwen2-7b-instruct-v0.8 by @mudler in #2717
  • models(gallery): add internlm2_5-7b-chat-1m by @mudler in #2719
  • models(gallery): add gemma-2-9b-it-sppo-iter3 by @mudler in #2722
  • models(gallery): add llama-3_8b_unaligned_alpha by @mudler in #2727
  • models(gallery): add l3-8b-lunaris-v1 by @mudler in #2729
  • models(gallery): add llama-3_8b_unaligned_alpha_rp_soup-i1 by @mudler in #2734
  • models(gallery): add hathor_respawn-l3-8b-v0.8 by @mudler in #2738
  • models(gallery): add llama3-8b-instruct-replete-adapted by @mudler in #2739
  • models(gallery): add llama-3-perky-pat-instruct-8b by @mudler in #2740
  • models(gallery): add l3-uncen-merger-omelette-rp-v0.2-8b by @mudler in #2741
  • models(gallery): add nymph_8b-i1 by @mudler in #2742
  • models(gallery): add smegmma-9b-v1 by @mudler in #2743
  • models(gallery): add hathor_tahsin-l3-8b-v0.85 by @mudler in #2762
  • models(gallery): add replete-coder-instruct-8b-merged by @mudler in #2782
  • models(gallery): add arliai-llama-3-8b-formax-v1.0 by @mudler in #2783
  • models(gallery): add smegmma-deluxe-9b-v1 by @mudler in #2784
  • models(gallery): add l3-ms-astoria-8b by @mudler in #2785
  • models(gallery): add halomaidrp-v1.33-15b-l3-i1 by @mudler in #2786
  • models(gallery): add llama-3-patronus-lynx-70b-instruct by @mudler in #2788
  • models(gallery): add llamax3 by @mudler in #2849
  • models(gallery): add arliai-llama-3-8b-dolfin-v0.5 by @mudler in #2852
  • models(gallery): add tiger-gemma-9b-v1-i1 by @mudler in #2853
  • feat: models(gallery): add deepseek-v2-lite by @mudler in #2658
  • models(gallery): ⬆️ update checksum by @localai-bot in #2860
  • models(gallery): add phi-3.1-mini-4k-instruct by @mudler in #2863
  • models(gallery): ⬆️ update checksum by @localai-bot in #2887
  • models(gallery): add ezo model series (llama3, gemma) by @mudler in #2891
  • models(gallery): add l3-8b-niitama-v1 by @mudler in #2895
  • models(gallery): add mathstral-7b-v0.1-imat by @mudler in #2901
  • models(gallery): add MythicalMaid/EtherealMaid 15b by @mudler in #2902
  • models(gallery): add flammenai/Mahou-1.3d-mistral-7B by @mudler in #2903
  • models(gallery): add big-tiger-gemma-27b-v1 by @mudler in #2918
  • models(gallery): add phillama-3.8b-v0.1 by @mudler in #2920
  • models(gallery): add qwen2-wukong-7b by @mudler in #2921
  • models(gallery): add einstein-v4-7b by @mudler in #2922
  • models(gallery): add gemma-2b-translation-v0.150 by @mudler in #2923
  • models(gallery): add emo-2b by @mudler in #2924
  • models(gallery): add celestev1 by @mudler in #2925

👒 Dependencies

  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2700
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2704
  • deps(whisper.cpp): update to latest commit by @mudler in #2709
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2718
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2725
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2736
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2744
  • ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2746
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2747
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2755
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2767
  • ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2756
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2774
  • chore(deps): Update Dependencies by @reneleonhardt in #2538
  • chore(deps): Bump dependabot/fetch-metadata from 2.1.0 to 2.2.0 by @dependabot in #2791
  • chore(deps): Bump llama-index from 0.9.48 to 0.10.55 in /examples/chainlit by @dependabot in #2795
  • chore(deps): Bump openai from 1.33.0 to 1.35.13 in /examples/functions by @dependabot in #2793
  • chore(deps): Bump nginx from 1.a.b.c to 1.27.0 in /examples/k8sgpt by @dependabot in #2790
  • chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/coqui by @dependabot in #2798
  • chore(deps): Bump inflect from 7.0.0 to 7.3.1 in /backend/python/openvoice by @dependabot in #2796
  • chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/parler-tts by @dependabot in #2797
  • chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/petals by @dependabot in #2799
  • chore(deps): Bump causal-conv1d from 1.2.0.post2 to 1.4.0 in /backend/python/mamba by @dependabot in #2792
  • chore(deps): Bump docs/themes/hugo-theme-relearn from c25bc2a to 1b2e139 by @dependabot in #2801
  • chore(deps): Bump tenacity from 8.3.0 to 8.5.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #2803
  • chore(deps): Bump openai from 1.33.0 to 1.35.13 in /examples/langchain-chroma by @dependabot in #2794
  • chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/bark by @dependabot in #2805
  • chore(deps): Bump streamlit from 1.30.0 to 1.36.0 in /examples/streamlit-bot by @dependabot in #2804
  • chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/diffusers by @dependabot in #2807
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/exllama2 by @dependabot in #2809
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/common/template by @dependabot in #2802
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/autogptq by @dependabot in #2800
  • chore(deps): Bump weaviate-client from 4.6.4 to 4.6.5 in /examples/chainlit by @dependabot in #2811
  • chore(deps): Bump gradio from 4.36.1 to 4.37.1 in /backend/python/openvoice in the pip group by @dependabot in #2815
  • chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/vall-e-x by @dependabot in #2812
  • chore(deps): Bump certifi from 2024.6.2 to 2024.7.4 in /examples/langchain/langchainpy-localai-example by @dependabot in #2814
  • chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/transformers by @dependabot in #2817
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/sentencetransformers by @dependabot in #2813
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/rerankers by @dependabot in #2819
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/parler-tts by @dependabot in #2818
  • chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/vllm by @dependabot in #2820
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/coqui by @dependabot in #2825
  • chore(deps): Bump faster-whisper from 0.9.0 to 1.0.3 in /backend/python/openvoice by @dependabot in #2829
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/exllama by @dependabot in #2841
  • chore(deps): Bump scipy from 1.13.0 to 1.14.0 in /backend/python/transformers-musicgen by @dependabot in #2842
  • chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2846
  • chore(deps): Bump langchain from 0.2.3 to 0.2.7 in /examples/functions by @dependabot in #2806
  • chore(deps): Bump mamba-ssm from 1.2.0.post1 to 2.2.2 in /backend/python/mamba by @dependabot in #2821
  • chore(deps): Bump pydantic from 2.7.3 to 2.8.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #2832
  • chore(deps): Bump langchain from 0.2.3 to 0.2.7 in /examples/langchain-chroma by @dependabot in #2822
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/bark by @dependabot in #2831
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/diffusers by @dependabot in #2833
  • chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/autogptq by @dependabot in #2816
  • chore(deps): Bump gradio from 4.36.1 to 4.38.1 in /backend/python/openvoice by @dependabot in #2840
  • chore(deps): Bump the pip group across 1 directory with 2 updates by @dependabot in #2848
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/transformers by @dependabot in #2837
  • chore(deps): Bump sentence-transformers from 2.5.1 to 3.0.1 in /backend/python/sentencetransformers by @dependabot in #2826
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/vall-e-x by @dependabot in #2830
  • chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/rerankers by @dependabot in #2834
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/vllm by @dependabot in #2839
  • chore(deps): Bump librosa from 0.9.1 to 0.10.2.post1 in /backend/python/openvoice by @dependabot in #2836
  • chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/transformers-musicgen by @dependabot in #2843
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/mamba by @dependabot in #2808
  • chore(deps): Bump llama-index from 0.10.43 to 0.10.55 in /examples/langchain-chroma by @dependabot in #2810
  • chore(deps): Bump langchain from 0.2.3 to 0.2.7 in /examples/langchain/langchainpy-localai-example by @dependabot in #2824
  • chore(deps): Bump numpy from 1.26.4 to 2.0.0 in /backend/python/openvoice by @dependabot in #2823
  • chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/transformers-musicgen by @dependabot in #2844
  • build(deps): bump docker/build-push-action from 5 to 6 by @dependabot in #2592
  • chore(deps): Bump chromadb from 0.5.0 to 0.5.4 in /examples/langchain-chroma by @dependabot in #2828
  • chore(deps): Bump torch from 2.2.0 to 2.3.1 in /backend/python/mamba by @dependabot in #2835
  • chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2851
  • chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/sentencetransformers by @dependabot in #2838
  • chore: update edgevpn dependency by @mudler in #2855
  • chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2859
  • chore(deps): Bump langchain from 0.2.7 to 0.2.8 in /examples/functions by @dependabot in #2873
  • chore(deps): Bump langchain from 0.2.7 to 0.2.8 in /examples/langchain/langchainpy-localai-example by @dependabot in #2874
  • chore(deps): Bump numexpr from 2.10.0 to 2.10.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #2877
  • chore: ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2885
  • chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2886
  • chore(deps): Bump debugpy from 1.8.1 to 1.8.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #2878
  • chore(deps): Bump langchain-community from 0.2.5 to 0.2.7 in /examples/langchain/langchainpy-localai-example by @dependabot in #2875
  • chore(deps): Bump langchain from 0.2.7 to 0.2.8 in /examples/langchain-chroma by @dependabot in #2872
  • chore(deps): Bump openai from 1.33.0 to 1.35.13 in /examples/langchain/langchainpy-localai-example by @dependabot in #2876
  • chore: ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2898
  • chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2897
  • chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2905
  • chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2917

Other Changes

  • ci: add pipelines for discord notifications by @mudler in #2703
  • ci(arm64): fix gRPC build by adding googletest to CMakefile by @mudler in #2754
  • fix: arm builds via disabling abseil tests by @dave-gray101 in #2758
  • ci(grpc): disable ABSEIL tests by @mudler in #2759
  • ci(deps): add libgmock-dev by @mudler in #2761
  • fix abseil test issue [attempt 3] by @dave-gray101 in #2769
  • feat(swagger): update swagger by @localai-bot in #2766
  • ci: Do not test the full matrix on PRs by @mudler in #2771
  • Git fetch specific branch instead of full tree during build by @LoricOSC in #2748
  • fix(ci): small fixups to checksum_checker.sh by @mudler in #2776
  • fix(ci): fixup correct path for check_and_update.py by @mudler in #2777
  • fixes to check_and_update.py script by @dave-gray101 in #2778
  • Update remaining git clones to git fetch by @LoricOSC in #2779
  • feat(scripts): add scripts to help adding new models to the gallery by @mudler in #2789
  • build: speedup git submodule update with --single-branch by @dave-gray101 in #2847
  • Revert "chore(deps): Bump inflect from 7.0.0 to 7.3.1 in /backend/python/openvoice" by @mudler in #2856
  • Revert "chore(deps): Bump librosa from 0.9.1 to 0.10.2.post1 in /backend/python/openvoice" by @mudler in #2861
  • feat(swagger): update swagger by @localai-bot in #2858
  • Revert "chore(deps): Bump numpy from 1.26.4 to 2.0.0 in /backend/python/openvoice" by @mudler in #2868
  • feat(swagger): update swagger by @localai-bot in #2884
  • fix: update grpcio version to match version used in builds by @cryptk in #2888
  • fix: cleanup indentation and remove duplicate dockerfile stanza by @cryptk in #2889
  • ci: add workflow to comment new Opened PRs by @mudler in #2892
  • build: fix go.mod - don't import ourself by @dave-gray101 in #2896
  • feat(swagger): update swagger by @localai-bot in #2908
  • refactor: move federated server logic to its own service by @mudler in #2914
  • refactor: groundwork - add pkg/concurrency and the associated test file by @dave-gray101 in #2745

New Contributors

Full Changelog: v2.18.1...v2.19.0
