mudler/LocalAI v2.22.0 on GitHub

LocalAI v2.22.0 is out 🥳

💡 Highlights

Image-to-Text and Video-to-Text Support: The VLLM backend now supports both image-to-text and video-to-text processing.
Enhanced Multimodal Support: Template placeholders are now available, offering more flexibility in multimodal applications.
Model Management Made Easy: List all your loaded models directly via the /system endpoint for seamless management.
Various bugfixes and improvements: Fixed issues with dangling processes to ensure proper resource management and resolved channel closure issues in the base GRPC server.
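As a rough illustration of the /system endpoint mentioned above, the following Python sketch queries it with the standard library. The base URL, the use of a plain GET request, and the helper names `system_url` and `fetch_system_info` are assumptions made for this example. The response schema is not described in these notes, so the sketch only parses the JSON and returns it as-is.

```python
import json
import urllib.request


def system_url(base_url: str) -> str:
    """Build the URL of the /system endpoint for a given base URL."""
    return base_url.rstrip("/") + "/system"


def fetch_system_info(base_url: str, timeout: float = 10.0):
    """Fetch and parse the /system response.

    Assumes the endpoint answers a plain GET with a JSON body; the fields
    it contains are not specified here, so the result is returned untouched.
    """
    with urllib.request.urlopen(system_url(base_url), timeout=timeout) as response:
        return json.load(response)
```

Against a running instance, calling `fetch_system_info("http://localhost:8080")` should return the parsed response, which can then be printed to inspect which fields it contains.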

🖼️ Multimodal vLLM

To use multimodal models with vLLM, simply specify the model in the YAML file. Models can differ, however, in whether they support multiple images or only a single image, and in how they handle image placeholders internally.

Some models and libraries express image, video, or audio placeholders differently. For example, the llama.cpp backend expects images within an [img-ID] tag, while other backends and models (e.g. vLLM) use a different notation (<|image_|>).

To override the defaults, you can now set the following in the model configuration:

template:
  video: "<|video_{{.ID}}|> {{.Text}}"
  image: "<|image_{{.ID}}|> {{.Text}}"
  audio: "<|audio_{{.ID}}|> {{.Text}}"
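To make the effect of these templates concrete, here is a small Python sketch of how such a placeholder string might expand. LocalAI itself is written in Go, and the `{{.ID}}` / `{{.Text}}` syntax is Go template notation. The `render_placeholder` helper below is hypothetical and only imitates the substitution with plain string replacement. The ID value and the sample text are assumptions chosen for illustration.

```python
def render_placeholder(template: str, item_id: int, text: str) -> str:
    """Expand {{.ID}} and {{.Text}} in a placeholder template.

    Hypothetical helper: it imitates the substitution with plain string
    replacement and is not LocalAI's actual template engine.
    """
    return template.replace("{{.ID}}", str(item_id)).replace("{{.Text}}", text)


# Template taken from the configuration above.
image_template = "<|image_{{.ID}}|> {{.Text}}"

# The ID (1) and the prompt text are illustrative values only.
rendered = render_placeholder(image_template, 1, "Describe this picture.")
print(rendered)  # <|image_1|> Describe this picture.
```

The point is that the template decides where the media placeholder sits relative to the user text, so a model that expects a different tag format can be accommodated by changing the configuration alone.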

📹 Video and Audio understanding

Some libraries might support both video and audio. Currently only vLLM supports video understanding; it can be used through the API by "extending" the OpenAI API with audio and video types alongside images:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What'\''s in this video?"
          },
          {
            "type": "video_url",
            "video_url": {
              "url": "https://video-image-url"
            }
          }
        ]
      }
    ],
    "max_tokens": 300
  }'
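The same request can be sketched in Python using only the standard library. The payload mirrors the curl example above. The helper names `build_video_request` and `ask_about_video` are hypothetical, and the base URL, model name, and video URL are the placeholder values from that example rather than working endpoints.

```python
import json
import urllib.request


def build_video_request(base_url: str, model: str, question: str,
                        video_url: str, max_tokens: int = 300) -> urllib.request.Request:
    """Build a chat completion request with one text part and one video part."""
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    # "video_url" is the extended content type shown above.
                    {"type": "video_url", "video_url": {"url": video_url}},
                ],
            }
        ],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_about_video(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Placeholder values copied from the curl example; nothing is sent here.
request = build_video_request(
    "http://localhost:8080", "gpt-4o", "What's in this video?", "https://video-image-url"
)
```

Building the request separately from sending it makes the payload easy to inspect before any network call. Passing `request` to `ask_about_video` would perform the actual call against a running instance.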

🧑‍🏭 Work in progress

The Realtime API is a work in progress, tracked in #3714. Give it a thumbs up if you want to see it supported in LocalAI!

What's Changed

Bug fixes 🐛

chore: simplify model loading by @mudler in #3715
fix(initializer): correctly reap dangling processes by @mudler in #3717
fix(base-grpc): close channel in base grpc server by @mudler in #3734
fix(vllm): bump cmake - vllm requires it by @mudler in #3744
fix(llama-cpp): consistently select fallback by @mudler in #3789
fix(welcome): do not list model twice if we have a config by @mudler in #3790
fix: listmodelservice / welcome endpoint use LOOSE_ONLY by @dave-gray101 in #3791

Exciting New Features 🎉

feat(api): list loaded models in /system by @mudler in #3661
feat: Add Get Token Metrics to GRPC server by @siddimore in #3687
refactor: ListModels Filtering Upgrade by @dave-gray101 in #2773
feat: track internally started models by ID by @mudler in #3693
feat: tokenization endpoint by @shraddhazpy in #3710
feat(multimodal): allow to template placeholders by @mudler in #3728
feat(vllm): add support for image-to-text and video-to-text by @mudler in #3729
feat(shutdown): allow force shutdown of backends by @mudler in #3733
feat(transformers): Use downloaded model for Transformers backend if it already exists. by @joshbtn in #3777
fix: roll out bluemonday Sanitize more widely by @dave-gray101 in #3794

🧠 Models

models(gallery): add llama-3.2 3B and 1B by @mudler in #3671
chore(model-gallery): ⬆️ update checksum by @localai-bot in #3675
models(gallery): add magnusintellectus-12b-v1-i1 by @mudler in #3678
models(gallery): add bigqwen2.5-52b-instruct by @mudler in #3679
feat(api): add correlationID to Track Chat requests by @siddimore in #3668
models(gallery): add replete-llm-v2.5-qwen-14b by @mudler in #3688
models(gallery): add replete-llm-v2.5-qwen-7b by @mudler in #3689
models(gallery): add calme-2.2-qwen2.5-72b-i1 by @mudler in #3691
models(gallery): add salamandra-7b-instruct by @mudler in #3726
models(gallery): add mn-backyardai-party-12b-v1-iq-arm-imatrix by @mudler in #3740
models(gallery): add t.e-8.1-iq-imatrix-request by @mudler in #3741
models(gallery): add violet_twilight-v0.2-iq-imatrix by @mudler in #3742
models(gallery): add gemma-2-9b-it-abliterated by @mudler in #3743
models(gallery): add moe-girl-1ba-7bt-i1 by @mudler in #3766
models(gallery): add archfunctions models by @mudler in #3767
models(gallery): add versatillama-llama-3.2-3b-instruct-abliterated by @mudler in #3771
models(gallery): add llama3.2-3b-enigma by @mudler in #3772
models(gallery): add llama3.2-3b-esper2 by @mudler in #3773
models(gallery): add llama-3.1-swallow-70b-v0.1-i1 by @mudler in #3774
models(gallery): add rombos-llm-v2.5.1-qwen-3b by @mudler in #3778
models(gallery): add qwen2.5-7b-ins-v3 by @mudler in #3779
models(gallery): add dans-personalityengine-v1.0.0-8b by @mudler in #3780
models(gallery): add llama-3.2-3b-agent007 by @mudler in #3781
models(gallery): add nihappy-l3.1-8b-v0.09 by @mudler in #3782
models(gallery): add llama-3.2-3b-agent007-coder by @mudler in #3783
models(gallery): add fireball-meta-llama-3.2-8b-instruct-agent-003-128k-code-dpo by @mudler in #3784
models(gallery): add gemma-2-ataraxy-v3i-9b by @mudler in #3785

📖 Documentation and examples

chore(docs): update CONTRIBUTING.md by @jjasghar in #3723

👒 Dependencies

chore: ⬆️ Update ggerganov/llama.cpp to ea9c32be71b91b42ecc538bd902e93cbb5fb36cb by @localai-bot in #3667
chore: ⬆️ Update ggerganov/whisper.cpp to 69339af2d104802f3f201fd419163defba52890e by @localai-bot in #3666
chore: ⬆️ Update ggerganov/llama.cpp to 95bc82fbc0df6d48cf66c857a4dda3d044f45ca2 by @localai-bot in #3674
chore: ⬆️ Update ggerganov/llama.cpp to b5de3b74a595cbfefab7eeb5a567425c6a9690cf by @localai-bot in #3681
chore: ⬆️ Update ggerganov/whisper.cpp to 8feb375fbdf0277ad36958c218c6bf48fa0ba75a by @localai-bot in #3680
chore: ⬆️ Update ggerganov/llama.cpp to c919d5db39c8a7fcb64737f008e4b105ee0acd20 by @localai-bot in #3686
chore(deps): bump grpcio to 1.66.2 by @mudler in #3690
chore(deps): Bump openai from 1.47.1 to 1.50.2 in /examples/langchain-chroma by @dependabot in #3697
chore(deps): Bump chromadb from 0.5.7 to 0.5.11 in /examples/langchain-chroma by @dependabot in #3696
chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/langchain-chroma by @dependabot in #3694
chore: ⬆️ Update ggerganov/llama.cpp to 6f1d9d71f4c568778a7637ff6582e6f6ba5fb9d3 by @localai-bot in #3708
chore(deps): Bump securego/gosec from 2.21.0 to 2.21.4 by @dependabot in #3698
chore(deps): Bump openai from 1.47.1 to 1.50.2 in /examples/functions by @dependabot in #3699
chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3704
chore(deps): Bump greenlet from 3.1.0 to 3.1.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3703
chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/functions by @dependabot in #3700
chore(deps): Bump langchain-community from 0.2.16 to 0.3.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3702
chore(deps): Bump gradio from 4.38.1 to 4.44.1 in /backend/python/openvoice by @dependabot in #3701
chore(deps): Bump llama-index from 0.11.12 to 0.11.14 in /examples/langchain-chroma by @dependabot in #3695
chore(deps): Bump aiohttp from 3.10.3 to 3.10.8 in /examples/langchain/langchainpy-localai-example by @dependabot in #3705
chore(deps): Bump yarl from 1.11.1 to 1.13.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3706
chore(deps): Bump llama-index from 0.11.12 to 0.11.14 in /examples/chainlit by @dependabot in #3707
chore: ⬆️ Update ggerganov/whisper.cpp to 2ef717b293fe93872cc3a03ca77942936a281959 by @localai-bot in #3712
chore: ⬆️ Update ggerganov/llama.cpp to 3f1ae2e32cde00c39b96be6d01c2997c29bae555 by @localai-bot in #3713
chore: ⬆️ Update ggerganov/llama.cpp to a39ab216aa624308fda7fa84439c6b61dc98b87a by @localai-bot in #3718
chore: ⬆️ Update ggerganov/whisper.cpp to ede1718f6d45aa3f7ad4a1e169dfbc9d51570c4e by @localai-bot in #3719
chore: ⬆️ Update ggerganov/llama.cpp to d5ed2b929d85bbd7dbeecb690880f07d9d7a6077 by @localai-bot in #3725
chore: ⬆️ Update ggerganov/whisper.cpp to ccc2547210e09e3a1785817383ab770389bb442b by @localai-bot in #3724
chore: ⬆️ Update ggerganov/llama.cpp to 71967c2a6d30da9f61580d3e2d4cb00e0223b6fa by @localai-bot in #3731
chore: ⬆️ Update ggerganov/whisper.cpp to 2944cb72d95282378037cb0eb45c9e2b2529ff2c by @localai-bot in #3730
chore: ⬆️ Update ggerganov/whisper.cpp to 6a94163b913d8e974e60d9ac56c8930d19f45773 by @localai-bot in #3735
chore: ⬆️ Update ggerganov/llama.cpp to 8c475b97b8ba7d678d4c9904b1161bd8811a9b44 by @localai-bot in #3736
chore: ⬆️ Update ggerganov/llama.cpp to d5cb86844f26f600c48bf3643738ea68138f961d by @localai-bot in #3738
chore: ⬆️ Update ggerganov/whisper.cpp to 9f346d00840bcd7af62794871109841af40cecfb by @localai-bot in #3739
chore(deps): Bump langchain from 0.3.1 to 0.3.2 in /examples/functions by @dependabot in #3755
chore(deps): Bump openai from 1.50.2 to 1.51.1 in /examples/functions by @dependabot in #3754
chore(deps): Bump openai from 1.45.1 to 1.51.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3748
chore(deps): Bump multidict from 6.0.5 to 6.1.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3749
chore(deps): Bump aiohttp from 3.10.8 to 3.10.9 in /examples/langchain/langchainpy-localai-example by @dependabot in #3750
chore(deps): Bump llama-index from 0.11.14 to 0.11.16 in /examples/chainlit by @dependabot in #3753
chore(deps): Bump streamlit from 1.38.0 to 1.39.0 in /examples/streamlit-bot by @dependabot in #3757
chore(deps): Bump debugpy from 1.8.2 to 1.8.6 in /examples/langchain/langchainpy-localai-example by @dependabot in #3751
chore(deps): Bump langchain from 0.3.1 to 0.3.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3752
chore(deps): Bump openai from 1.50.2 to 1.51.1 in /examples/langchain-chroma by @dependabot in #3758
chore(deps): Bump llama-index from 0.11.14 to 0.11.16 in /examples/langchain-chroma by @dependabot in #3760
chore(deps): Bump nginx from 1.27.0 to 1.27.2 in /examples/k8sgpt by @dependabot in #3761
chore(deps): Bump appleboy/ssh-action from 1.0.3 to 1.1.0 by @dependabot in #3762
chore: ⬆️ Update ggerganov/llama.cpp to 6374743747b14db4eb73ce82ae449a2978bc3b47 by @localai-bot in #3763
chore: ⬆️ Update ggerganov/whisper.cpp to ebca09a3d1033417b0c630bbbe607b0f185b1488 by @localai-bot in #3764
chore: ⬆️ Update ggerganov/llama.cpp to dca1d4b58a7f1acf1bd253be84e50d6367f492fd by @localai-bot in #3769
chore: ⬆️ Update ggerganov/whisper.cpp to fdbfb460ed546452a5d53611bba66d10d842e719 by @localai-bot in #3768
chore: ⬆️ Update ggerganov/llama.cpp to c81f3bbb051f8b736e117dfc78c99d7c4e0450f6 by @localai-bot in #3775
chore: ⬆️ Update ggerganov/llama.cpp to 0e9f760eb12546704ef8fa72577bc1a3ffe1bc04 by @localai-bot in #3786
chore(deps): bump llama-cpp to 96776405a17034dcfd53d3ddf5d142d34bdbb657 by @mudler in #3793

Other Changes

docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3665
feat(swagger): update swagger by @localai-bot in #3664
chore(refactor): track grpcProcess in the model structure by @mudler in #3663
chore: get model also from query by @mudler in #3716
chore(federated): display a message when nodes are not available by @mudler in #3721
chore(vllm): do not install from source by @mudler in #3745
chore(Dockerfile): default to cmake from package manager by @mudler in #3746
chore(tests): improve rwkv tests and consume TEST_FLAKES by @mudler in #3765

New Contributors

@siddimore made their first contribution in #3668
@shraddhazpy made their first contribution in #3710
@jjasghar made their first contribution in #3723
@joshbtn made their first contribution in #3777

Full Changelog: v2.21.1...v2.22.0