LocalAI 2.19.1 is out! 📣
TL;DR: Summary spotlight
- 🖧 Federated Instances via P2P: LocalAI now supports federated instances with P2P, offering both load-balanced and non-load-balanced options.
- 🎛️ P2P Dashboard: A new dashboard to guide and assist in setting up P2P instances with auto-discovery using shared tokens.
- 🔊 TTS Integration: Text-to-Speech (TTS) is now included in the binary releases.
- 🛠️ Enhanced Installer: The installer script now supports setting up federated instances.
- 📥 Model Pulling: Models can now be pulled directly via URL.
- 🖼️ WebUI Enhancements: Visual improvements and cleanups to the WebUI and model lists.
- 🧠 llama-cpp Backend: The llama-cpp (grpc) backend now supports embedding ( https://localai.io/features/embeddings/#llamacpp-embeddings )
- ⚙️ Tool Support: Small enhancements to tools with disabled grammars.
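The new llama.cpp embeddings support is exposed through the OpenAI-compatible API. A minimal sketch of a request, assuming a local instance on the default port with an embeddings-capable model already configured (the model name here is a placeholder, not something this release ships):

```shell
# Request an embedding vector from a running LocalAI instance.
# "my-embedding-model" is a placeholder for a model configured with
# embeddings enabled on the llama.cpp backend.
curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-embedding-model",
    "input": "A long time ago in a galaxy far, far away"
  }'
```

See https://localai.io/features/embeddings/#llamacpp-embeddings for the configuration details.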
🖧 LocalAI Federation and AI swarms
LocalAI makes distributed AI workloads simpler and more accessible: no complex setups, Docker, or Kubernetes configurations are needed to create your own AI cluster. By auto-discovering instances and sharing the work or the model weights of an LLM across your existing devices, LocalAI can scale both horizontally and vertically with ease.
How does it work?
Starting LocalAI with --p2p generates a shared token for connecting multiple instances: that's all you need to create AI clusters, eliminating the need for intricate network setups. Simply navigate to the "Swarm" section in the WebUI and follow the on-screen instructions.
For fully shared instances, start LocalAI with --p2p --federated and follow the Swarm section's guidance. This feature is still experimental and should be considered a tech preview.
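The flow above can be sketched in a couple of commands; this is a minimal sketch, assuming the local-ai binary from this release is installed on each node and that the shared token is copied from the first node's logs or WebUI (the TOKEN variable name follows the Swarm tab's on-screen instructions and should be treated as an assumption):

```shell
# First node: enable P2P and federation. A shared token is generated
# and shown in the logs and in the WebUI "Swarm" tab.
local-ai run --p2p --federated

# Any other node: reuse that token so instances auto-discover each
# other, even across different networks.
export TOKEN="<token from the first node>"
local-ai run --p2p --federated
```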
Federated LocalAI
Launch multiple LocalAI instances and cluster them together to share requests across the cluster. The "Swarm" tab in the WebUI provides one-liner instructions on connecting various LocalAI instances using a shared token. Instances will auto-discover each other, even across different networks.
Check out a demonstration video: Watch now
LocalAI P2P Workers
Distribute weights across nodes by starting multiple LocalAI workers, currently available only on the llama.cpp backend, with plans to expand to other backends soon.
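Joining a worker to a swarm can be sketched as follows; a minimal sketch, assuming the shared token from the main instance and the worker subcommand shown in the Swarm tab (treat the exact subcommand name as an assumption, and check the WebUI's one-liner for your version):

```shell
# On the node that serves requests:
local-ai run --p2p

# On each additional node, join as a worker that takes a share of the
# model weights (llama.cpp backend only in this release):
export TOKEN="<shared token from the main instance>"
local-ai worker p2p-llama-cpp-rpc
```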
Check out a demonstration video: Watch now
What's Changed
Bug fixes 🐛
- fix: make sure the GNUMake jobserver is passed to cmake for the llama.cpp build by @cryptk in #2697
- Using exec when starting a backend instead of spawning a new process by @a17t in #2720
- fix(cuda): downgrade default version from 12.5 to 12.4 by @mudler in #2707
- fix: Lora loading by @vaaale in #2893
- fix: short-circuit when nodes aren't detected by @mudler in #2909
- fix: do not list txt files as potential models by @mudler in #2910
🖧 P2P area
- feat(p2p): Federation and AI swarms by @mudler in #2723
- feat(p2p): allow to disable DHT and use only LAN by @mudler in #2751
Exciting New Features 🎉
- Allows to remove a backend from the list by @mauromorales in #2721
- ci(Makefile): adds tts in binary releases by @mudler in #2695
- feat: HF /scan endpoint by @dave-gray101 in #2566
- feat(model-list): be consistent, skip known files from listing by @mudler in #2760
- feat(models): pull models from urls by @mudler in #2750
- feat(webui): show also models without a config in the welcome page by @mudler in #2772
- feat(install.sh): support federated install by @mudler in #2752
- feat(llama.cpp): support embeddings endpoints by @mudler in #2871
- feat(functions): parse broken JSON when we parse the raw results, use dynamic rules for grammar keys by @mudler in #2912
- feat(federation): add load balanced option by @mudler in #2915
🧠 Models
- models(gallery): ⬆️ update checksum by @localai-bot in #2701
- models(gallery): add l3-8b-everything-cot by @mudler in #2705
- models(gallery): add hercules-5.0-qwen2-7b by @mudler in #2708
- models(gallery): add llama3-8b-darkidol-2.2-uncensored-1048k-iq-imatrix by @mudler in #2710
- models(gallery): add llama-3-llamilitary by @mudler in #2711
- models(gallery): add tess-v2.5-gemma-2-27b-alpha by @mudler in #2712
- models(gallery): add arcee-agent by @mudler in #2713
- models(gallery): add gemma2-daybreak by @mudler in #2714
- models(gallery): add L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF by @mudler in #2715
- models(gallery): add qwen2-7b-instruct-v0.8 by @mudler in #2717
- models(gallery): add internlm2_5-7b-chat-1m by @mudler in #2719
- models(gallery): add gemma-2-9b-it-sppo-iter3 by @mudler in #2722
- models(gallery): add llama-3_8b_unaligned_alpha by @mudler in #2727
- models(gallery): add l3-8b-lunaris-v1 by @mudler in #2729
- models(gallery): add llama-3_8b_unaligned_alpha_rp_soup-i1 by @mudler in #2734
- models(gallery): add hathor_respawn-l3-8b-v0.8 by @mudler in #2738
- models(gallery): add llama3-8b-instruct-replete-adapted by @mudler in #2739
- models(gallery): add llama-3-perky-pat-instruct-8b by @mudler in #2740
- models(gallery): add l3-uncen-merger-omelette-rp-v0.2-8b by @mudler in #2741
- models(gallery): add nymph_8b-i1 by @mudler in #2742
- models(gallery): add smegmma-9b-v1 by @mudler in #2743
- models(gallery): add hathor_tahsin-l3-8b-v0.85 by @mudler in #2762
- models(gallery): add replete-coder-instruct-8b-merged by @mudler in #2782
- models(gallery): add arliai-llama-3-8b-formax-v1.0 by @mudler in #2783
- models(gallery): add smegmma-deluxe-9b-v1 by @mudler in #2784
- models(gallery): add l3-ms-astoria-8b by @mudler in #2785
- models(gallery): add halomaidrp-v1.33-15b-l3-i1 by @mudler in #2786
- models(gallery): add llama-3-patronus-lynx-70b-instruct by @mudler in #2788
- models(gallery): add llamax3 by @mudler in #2849
- models(gallery): add arliai-llama-3-8b-dolfin-v0.5 by @mudler in #2852
- models(gallery): add tiger-gemma-9b-v1-i1 by @mudler in #2853
- feat: models(gallery): add deepseek-v2-lite by @mudler in #2658
- models(gallery): ⬆️ update checksum by @localai-bot in #2860
- models(gallery): add phi-3.1-mini-4k-instruct by @mudler in #2863
- models(gallery): ⬆️ update checksum by @localai-bot in #2887
- models(gallery): add ezo model series (llama3, gemma) by @mudler in #2891
- models(gallery): add l3-8b-niitama-v1 by @mudler in #2895
- models(gallery): add mathstral-7b-v0.1-imat by @mudler in #2901
- models(gallery): add MythicalMaid/EtherealMaid 15b by @mudler in #2902
- models(gallery): add flammenai/Mahou-1.3d-mistral-7B by @mudler in #2903
- models(gallery): add big-tiger-gemma-27b-v1 by @mudler in #2918
- models(gallery): add phillama-3.8b-v0.1 by @mudler in #2920
- models(gallery): add qwen2-wukong-7b by @mudler in #2921
- models(gallery): add einstein-v4-7b by @mudler in #2922
- models(gallery): add gemma-2b-translation-v0.150 by @mudler in #2923
- models(gallery): add emo-2b by @mudler in #2924
- models(gallery): add celestev1 by @mudler in #2925
📖 Documentation and examples
- ⬆️ Update docs version mudler/LocalAI by @localai-bot in #2699
- examples(gha): add example on how to run LocalAI in Github actions by @mudler in #2716
- docs(swagger): enhance coverage of APIs by @mudler in #2753
- docs(swagger): comment LocalAI gallery endpoints and rerankers by @mudler in #2854
- docs: add a note on benchmarks by @mudler in #2857
- docs(swagger): cover p2p endpoints by @mudler in #2862
- ci: use github action by @mudler in #2899
- docs: update try-it-out.md by @eltociear in #2906
- docs(swagger): core more localai/openai endpoints by @mudler in #2904
- docs: more swagger, update docs by @mudler in #2907
- feat(swagger): update swagger by @localai-bot in #2916
👒 Dependencies
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2700
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2704
- deps(whisper.cpp): update to latest commit by @mudler in #2709
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2718
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2725
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2736
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2744
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2746
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2747
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2755
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2767
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2756
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2774
- chore(deps): Update Dependencies by @reneleonhardt in #2538
- chore(deps): Bump dependabot/fetch-metadata from 2.1.0 to 2.2.0 by @dependabot in #2791
- chore(deps): Bump llama-index from 0.9.48 to 0.10.55 in /examples/chainlit by @dependabot in #2795
- chore(deps): Bump openai from 1.33.0 to 1.35.13 in /examples/functions by @dependabot in #2793
- chore(deps): Bump nginx from 1.a.b.c to 1.27.0 in /examples/k8sgpt by @dependabot in #2790
- chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/coqui by @dependabot in #2798
- chore(deps): Bump inflect from 7.0.0 to 7.3.1 in /backend/python/openvoice by @dependabot in #2796
- chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/parler-tts by @dependabot in #2797
- chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/petals by @dependabot in #2799
- chore(deps): Bump causal-conv1d from 1.2.0.post2 to 1.4.0 in /backend/python/mamba by @dependabot in #2792
- chore(deps): Bump docs/themes/hugo-theme-relearn from c25bc2a to 1b2e139 by @dependabot in #2801
- chore(deps): Bump tenacity from 8.3.0 to 8.5.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #2803
- chore(deps): Bump openai from 1.33.0 to 1.35.13 in /examples/langchain-chroma by @dependabot in #2794
- chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/bark by @dependabot in #2805
- chore(deps): Bump streamlit from 1.30.0 to 1.36.0 in /examples/streamlit-bot by @dependabot in #2804
- chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/diffusers by @dependabot in #2807
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/exllama2 by @dependabot in #2809
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/common/template by @dependabot in #2802
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/autogptq by @dependabot in #2800
- chore(deps): Bump weaviate-client from 4.6.4 to 4.6.5 in /examples/chainlit by @dependabot in #2811
- chore(deps): Bump gradio from 4.36.1 to 4.37.1 in /backend/python/openvoice in the pip group by @dependabot in #2815
- chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/vall-e-x by @dependabot in #2812
- chore(deps): Bump certifi from 2024.6.2 to 2024.7.4 in /examples/langchain/langchainpy-localai-example by @dependabot in #2814
- chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/transformers by @dependabot in #2817
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/sentencetransformers by @dependabot in #2813
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/rerankers by @dependabot in #2819
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/parler-tts by @dependabot in #2818
- chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/vllm by @dependabot in #2820
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/coqui by @dependabot in #2825
- chore(deps): Bump faster-whisper from 0.9.0 to 1.0.3 in /backend/python/openvoice by @dependabot in #2829
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/exllama by @dependabot in #2841
- chore(deps): Bump scipy from 1.13.0 to 1.14.0 in /backend/python/transformers-musicgen by @dependabot in #2842
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2846
- chore(deps): Bump langchain from 0.2.3 to 0.2.7 in /examples/functions by @dependabot in #2806
- chore(deps): Bump mamba-ssm from 1.2.0.post1 to 2.2.2 in /backend/python/mamba by @dependabot in #2821
- chore(deps): Bump pydantic from 2.7.3 to 2.8.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #2832
- chore(deps): Bump langchain from 0.2.3 to 0.2.7 in /examples/langchain-chroma by @dependabot in #2822
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/bark by @dependabot in #2831
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/diffusers by @dependabot in #2833
- chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/autogptq by @dependabot in #2816
- chore(deps): Bump gradio from 4.36.1 to 4.38.1 in /backend/python/openvoice by @dependabot in #2840
- chore(deps): Bump the pip group across 1 directory with 2 updates by @dependabot in #2848
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/transformers by @dependabot in #2837
- chore(deps): Bump sentence-transformers from 2.5.1 to 3.0.1 in /backend/python/sentencetransformers by @dependabot in #2826
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/vall-e-x by @dependabot in #2830
- chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/rerankers by @dependabot in #2834
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/vllm by @dependabot in #2839
- chore(deps): Bump librosa from 0.9.1 to 0.10.2.post1 in /backend/python/openvoice by @dependabot in #2836
- chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/transformers-musicgen by @dependabot in #2843
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/mamba by @dependabot in #2808
- chore(deps): Bump llama-index from 0.10.43 to 0.10.55 in /examples/langchain-chroma by @dependabot in #2810
- chore(deps): Bump langchain from 0.2.3 to 0.2.7 in /examples/langchain/langchainpy-localai-example by @dependabot in #2824
- chore(deps): Bump numpy from 1.26.4 to 2.0.0 in /backend/python/openvoice by @dependabot in #2823
- chore(deps): Bump grpcio from 1.64.0 to 1.64.1 in /backend/python/transformers-musicgen by @dependabot in #2844
- build(deps): bump docker/build-push-action from 5 to 6 by @dependabot in #2592
- chore(deps): Bump chromadb from 0.5.0 to 0.5.4 in /examples/langchain-chroma by @dependabot in #2828
- chore(deps): Bump torch from 2.2.0 to 2.3.1 in /backend/python/mamba by @dependabot in #2835
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2851
- chore(deps): Bump setuptools from 69.5.1 to 70.3.0 in /backend/python/sentencetransformers by @dependabot in #2838
- chore: update edgevpn dependency by @mudler in #2855
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2859
- chore(deps): Bump langchain from 0.2.7 to 0.2.8 in /examples/functions by @dependabot in #2873
- chore(deps): Bump langchain from 0.2.7 to 0.2.8 in /examples/langchain/langchainpy-localai-example by @dependabot in #2874
- chore(deps): Bump numexpr from 2.10.0 to 2.10.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #2877
- chore: ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2885
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2886
- chore(deps): Bump debugpy from 1.8.1 to 1.8.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #2878
- chore(deps): Bump langchain-community from 0.2.5 to 0.2.7 in /examples/langchain/langchainpy-localai-example by @dependabot in #2875
- chore(deps): Bump langchain from 0.2.7 to 0.2.8 in /examples/langchain-chroma by @dependabot in #2872
- chore(deps): Bump openai from 1.33.0 to 1.35.13 in /examples/langchain/langchainpy-localai-example by @dependabot in #2876
- chore: ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2898
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2897
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2905
- chore: ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2917
Other Changes
- ci: add pipelines for discord notifications by @mudler in #2703
- ci(arm64): fix gRPC build by adding googletest to CMakefile by @mudler in #2754
- fix: arm builds via disabling abseil tests by @dave-gray101 in #2758
- ci(grpc): disable ABSEIL tests by @mudler in #2759
- ci(deps): add libgmock-dev by @mudler in #2761
- fix abseil test issue [attempt 3] by @dave-gray101 in #2769
- feat(swagger): update swagger by @localai-bot in #2766
- ci: Do not test the full matrix on PRs by @mudler in #2771
- Git fetch specific branch instead of full tree during build by @LoricOSC in #2748
- fix(ci): small fixups to checksum_checker.sh by @mudler in #2776
- fix(ci): fixup correct path for check_and_update.py by @mudler in #2777
- fixes to check_and_update.py script by @dave-gray101 in #2778
- Update remaining git clones to git fetch by @LoricOSC in #2779
- feat(scripts): add scripts to help adding new models to the gallery by @mudler in #2789
- build: speedup git submodule update with --single-branch by @dave-gray101 in #2847
- Revert "chore(deps): Bump inflect from 7.0.0 to 7.3.1 in /backend/python/openvoice" by @mudler in #2856
- Revert "chore(deps): Bump librosa from 0.9.1 to 0.10.2.post1 in /backend/python/openvoice" by @mudler in #2861
- feat(swagger): update swagger by @localai-bot in #2858
- Revert "chore(deps): Bump numpy from 1.26.4 to 2.0.0 in /backend/python/openvoice" by @mudler in #2868
- feat(swagger): update swagger by @localai-bot in #2884
- fix: update grpcio version to match version used in builds by @cryptk in #2888
- fix: cleanup indentation and remove duplicate dockerfile stanza by @cryptk in #2889
- ci: add workflow to comment new Opened PRs by @mudler in #2892
- build: fix go.mod - don't import ourself by @dave-gray101 in #2896
- feat(swagger): update swagger by @localai-bot in #2908
- refactor: move federated server logic to its own service by @mudler in #2914
- refactor: groundwork - add pkg/concurrency and the associated test file by @dave-gray101 in #2745
New Contributors
- @a17t made their first contribution in #2720
- @LoricOSC made their first contribution in #2748
- @vaaale made their first contribution in #2893
Full Changelog: v2.18.1...v2.19.0