github mudler/LocalAI v2.22.0

12 hours ago

LocalAI v2.22.0 is out 🥳

💡 Highlights

  • Image-to-Text and Video-to-Text Support: The VLLM backend now supports both image-to-text and video-to-text processing.
  • Enhanced Multimodal Support: Template placeholders are now available, offering more flexibility in multimodal applications.
  • Model Management Made Easy: List all your loaded models directly via the /system endpoint for seamless management.
  • Various bugfixes and improvements: Fixed issues with dangling processes to ensure proper resource management and resolved channel closure issues in the base GRPC server.
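
As a rough sketch of the /system highlight above, the endpoint can be queried from Python with the standard library. The base URL (http://localhost:8080, borrowed from the curl example in these notes) and the helper names are assumptions for illustration, and the sketch assumes the endpoint answers a GET with JSON; it does not assume any particular field names in the reply.

```python
import json
import urllib.request


def system_url(base_url: str) -> str:
    """Return the /system URL for a LocalAI instance (hypothetical helper)."""
    return base_url.rstrip("/") + "/system"


def fetch_system_info(base_url: str, opener=urllib.request.urlopen):
    """GET /system and return the decoded JSON document, whatever its shape.

    `opener` is injectable so the function can be exercised without a server.
    """
    with opener(system_url(base_url)) as response:
        return json.load(response)
```

With a running instance, `print(json.dumps(fetch_system_info("http://localhost:8080"), indent=2))` would dump whatever the server reports.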

🖼️ Multimodal vLLM

To use multimodal models with vLLM, simply specify the model in the YAML file. Models differ, however, in whether they support multiple images or only a single image, and in how they handle image placeholders internally.

Models and libraries express image, video, and audio placeholders in different ways. For example, the llama.cpp backend expects images within an [img-ID] tag, while other backends and models (e.g. vLLM) use a different notation (<|image_|>).

To override the defaults, you can now set the following in the model configuration:

template:
  video: "<|video_{{.ID}}|> {{.Text}}"
  image: "<|image_{{.ID}}|> {{.Text}}"
  audio: "<|audio_{{.ID}}|> {{.Text}}"
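
To make the effect of these templates concrete, here is a minimal Python sketch of how one placeholder template expands into prompt text. It is an illustration only: the {{.ID}} and {{.Text}} syntax is Go template syntax, and this helper (a hypothetical name) merely substitutes those two fields with plain string replacement. The ID value used below is arbitrary; how LocalAI numbers media items is not stated in these notes.

```python
def render_placeholder(template: str, item_id: int, text: str) -> str:
    """Expand the {{.ID}} and {{.Text}} fields of a placeholder template.

    Illustration only: this handles just the two plain fields used in the
    configuration above, not general Go template syntax.
    """
    return template.replace("{{.ID}}", str(item_id)).replace("{{.Text}}", text)


image_template = "<|image_{{.ID}}|> {{.Text}}"
prompt = render_placeholder(image_template, 1, "What is in this picture?")
print(prompt)  # <|image_1|> What is in this picture?
```

The same helper applied to an [img-{{.ID}}]-style template shows how one request could be rendered differently for a backend that expects the llama.cpp notation.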

📹 Video and Audio understanding

Some libraries might support both video and audio. Currently only vLLM supports video understanding. It can be used through the API by "extending" the OpenAI API with audio and video content types alongside images:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What'\''s in this video?"
          },
          {
            "type": "video_url",
            "video_url": {
              "url": "https://video-image-url"
            }
          }
        ]
      }
    ],
    "max_tokens": 300
  }'
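
The same request can be sketched in Python using only the standard library. The payload mirrors the curl example above; the function names are hypothetical, and the model name and video URL are the placeholders from that example rather than working values.

```python
import json
import urllib.request


def build_video_request(model: str, question: str, video_url: str,
                        max_tokens: int = 300) -> dict:
    """Build a chat completion body with one text part and one video part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "video_url", "video_url": {"url": video_url}},
                ],
            }
        ],
        "max_tokens": max_tokens,
    }


def send_chat_request(base_url: str, body: dict) -> dict:
    """POST the body to /v1/chat/completions and return the decoded JSON reply."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Same placeholders as the curl example; printing lets you inspect the body.
body = build_video_request("gpt-4o", "What's in this video?", "https://video-image-url")
print(json.dumps(body, indent=2))
```

Against a running instance, `send_chat_request("http://localhost:8080", body)` would submit it. Keeping payload construction separate from the network call makes the body easy to inspect or adapt before sending.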

🧑‍🏭 Work in progress

  • Realtime API support is a work in progress, tracked in #3714. Give it a thumbs up if you want to see it supported in LocalAI!

What's Changed

Bug fixes 🐛

  • chore: simplify model loading by @mudler in #3715
  • fix(initializer): correctly reap dangling processes by @mudler in #3717
  • fix(base-grpc): close channel in base grpc server by @mudler in #3734
  • fix(vllm): bump cmake - vllm requires it by @mudler in #3744
  • fix(llama-cpp): consistently select fallback by @mudler in #3789
  • fix(welcome): do not list model twice if we have a config by @mudler in #3790
  • fix: listmodelservice / welcome endpoint use LOOSE_ONLY by @dave-gray101 in #3791

Exciting New Features 🎉

  • feat(api): list loaded models in /system by @mudler in #3661
  • feat: Add Get Token Metrics to GRPC server by @siddimore in #3687
  • refactor: ListModels Filtering Upgrade by @dave-gray101 in #2773
  • feat: track internally started models by ID by @mudler in #3693
  • feat: tokenization endpoint by @shraddhazpy in #3710
  • feat(multimodal): allow to template placeholders by @mudler in #3728
  • feat(vllm): add support for image-to-text and video-to-text by @mudler in #3729
  • feat(shutdown): allow force shutdown of backends by @mudler in #3733
  • feat(transformers): Use downloaded model for Transformers backend if it already exists. by @joshbtn in #3777
  • fix: roll out bluemonday Sanitize more widely by @dave-gray101 in #3794

🧠 Models

  • models(gallery): add llama-3.2 3B and 1B by @mudler in #3671
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #3675
  • models(gallery): add magnusintellectus-12b-v1-i1 by @mudler in #3678
  • models(gallery): add bigqwen2.5-52b-instruct by @mudler in #3679
  • feat(api): add correlationID to Track Chat requests by @siddimore in #3668
  • models(gallery): add replete-llm-v2.5-qwen-14b by @mudler in #3688
  • models(gallery): add replete-llm-v2.5-qwen-7b by @mudler in #3689
  • models(gallery): add calme-2.2-qwen2.5-72b-i1 by @mudler in #3691
  • models(gallery): add salamandra-7b-instruct by @mudler in #3726
  • models(gallery): add mn-backyardai-party-12b-v1-iq-arm-imatrix by @mudler in #3740
  • models(gallery): add t.e-8.1-iq-imatrix-request by @mudler in #3741
  • models(gallery): add violet_twilight-v0.2-iq-imatrix by @mudler in #3742
  • models(gallery): add gemma-2-9b-it-abliterated by @mudler in #3743
  • models(gallery): add moe-girl-1ba-7bt-i1 by @mudler in #3766
  • models(gallery): add archfunctions models by @mudler in #3767
  • models(gallery): add versatillama-llama-3.2-3b-instruct-abliterated by @mudler in #3771
  • models(gallery): add llama3.2-3b-enigma by @mudler in #3772
  • models(gallery): add llama3.2-3b-esper2 by @mudler in #3773
  • models(gallery): add llama-3.1-swallow-70b-v0.1-i1 by @mudler in #3774
  • models(gallery): add rombos-llm-v2.5.1-qwen-3b by @mudler in #3778
  • models(gallery): add qwen2.5-7b-ins-v3 by @mudler in #3779
  • models(gallery): add dans-personalityengine-v1.0.0-8b by @mudler in #3780
  • models(gallery): add llama-3.2-3b-agent007 by @mudler in #3781
  • models(gallery): add nihappy-l3.1-8b-v0.09 by @mudler in #3782
  • models(gallery): add llama-3.2-3b-agent007-coder by @mudler in #3783
  • models(gallery): add fireball-meta-llama-3.2-8b-instruct-agent-003-128k-code-dpo by @mudler in #3784
  • models(gallery): add gemma-2-ataraxy-v3i-9b by @mudler in #3785

📖 Documentation and examples

👒 Dependencies

  • chore: ⬆️ Update ggerganov/llama.cpp to ea9c32be71b91b42ecc538bd902e93cbb5fb36cb by @localai-bot in #3667
  • chore: ⬆️ Update ggerganov/whisper.cpp to 69339af2d104802f3f201fd419163defba52890e by @localai-bot in #3666
  • chore: ⬆️ Update ggerganov/llama.cpp to 95bc82fbc0df6d48cf66c857a4dda3d044f45ca2 by @localai-bot in #3674
  • chore: ⬆️ Update ggerganov/llama.cpp to b5de3b74a595cbfefab7eeb5a567425c6a9690cf by @localai-bot in #3681
  • chore: ⬆️ Update ggerganov/whisper.cpp to 8feb375fbdf0277ad36958c218c6bf48fa0ba75a by @localai-bot in #3680
  • chore: ⬆️ Update ggerganov/llama.cpp to c919d5db39c8a7fcb64737f008e4b105ee0acd20 by @localai-bot in #3686
  • chore(deps): bump grpcio to 1.66.2 by @mudler in #3690
  • chore(deps): Bump openai from 1.47.1 to 1.50.2 in /examples/langchain-chroma by @dependabot in #3697
  • chore(deps): Bump chromadb from 0.5.7 to 0.5.11 in /examples/langchain-chroma by @dependabot in #3696
  • chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/langchain-chroma by @dependabot in #3694
  • chore: ⬆️ Update ggerganov/llama.cpp to 6f1d9d71f4c568778a7637ff6582e6f6ba5fb9d3 by @localai-bot in #3708
  • chore(deps): Bump securego/gosec from 2.21.0 to 2.21.4 by @dependabot in #3698
  • chore(deps): Bump openai from 1.47.1 to 1.50.2 in /examples/functions by @dependabot in #3699
  • chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3704
  • chore(deps): Bump greenlet from 3.1.0 to 3.1.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3703
  • chore(deps): Bump langchain from 0.3.0 to 0.3.1 in /examples/functions by @dependabot in #3700
  • chore(deps): Bump langchain-community from 0.2.16 to 0.3.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3702
  • chore(deps): Bump gradio from 4.38.1 to 4.44.1 in /backend/python/openvoice by @dependabot in #3701
  • chore(deps): Bump llama-index from 0.11.12 to 0.11.14 in /examples/langchain-chroma by @dependabot in #3695
  • chore(deps): Bump aiohttp from 3.10.3 to 3.10.8 in /examples/langchain/langchainpy-localai-example by @dependabot in #3705
  • chore(deps): Bump yarl from 1.11.1 to 1.13.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3706
  • chore(deps): Bump llama-index from 0.11.12 to 0.11.14 in /examples/chainlit by @dependabot in #3707
  • chore: ⬆️ Update ggerganov/whisper.cpp to 2ef717b293fe93872cc3a03ca77942936a281959 by @localai-bot in #3712
  • chore: ⬆️ Update ggerganov/llama.cpp to 3f1ae2e32cde00c39b96be6d01c2997c29bae555 by @localai-bot in #3713
  • chore: ⬆️ Update ggerganov/llama.cpp to a39ab216aa624308fda7fa84439c6b61dc98b87a by @localai-bot in #3718
  • chore: ⬆️ Update ggerganov/whisper.cpp to ede1718f6d45aa3f7ad4a1e169dfbc9d51570c4e by @localai-bot in #3719
  • chore: ⬆️ Update ggerganov/llama.cpp to d5ed2b929d85bbd7dbeecb690880f07d9d7a6077 by @localai-bot in #3725
  • chore: ⬆️ Update ggerganov/whisper.cpp to ccc2547210e09e3a1785817383ab770389bb442b by @localai-bot in #3724
  • chore: ⬆️ Update ggerganov/llama.cpp to 71967c2a6d30da9f61580d3e2d4cb00e0223b6fa by @localai-bot in #3731
  • chore: ⬆️ Update ggerganov/whisper.cpp to 2944cb72d95282378037cb0eb45c9e2b2529ff2c by @localai-bot in #3730
  • chore: ⬆️ Update ggerganov/whisper.cpp to 6a94163b913d8e974e60d9ac56c8930d19f45773 by @localai-bot in #3735
  • chore: ⬆️ Update ggerganov/llama.cpp to 8c475b97b8ba7d678d4c9904b1161bd8811a9b44 by @localai-bot in #3736
  • chore: ⬆️ Update ggerganov/llama.cpp to d5cb86844f26f600c48bf3643738ea68138f961d by @localai-bot in #3738
  • chore: ⬆️ Update ggerganov/whisper.cpp to 9f346d00840bcd7af62794871109841af40cecfb by @localai-bot in #3739
  • chore(deps): Bump langchain from 0.3.1 to 0.3.2 in /examples/functions by @dependabot in #3755
  • chore(deps): Bump openai from 1.50.2 to 1.51.1 in /examples/functions by @dependabot in #3754
  • chore(deps): Bump openai from 1.45.1 to 1.51.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3748
  • chore(deps): Bump multidict from 6.0.5 to 6.1.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3749
  • chore(deps): Bump aiohttp from 3.10.8 to 3.10.9 in /examples/langchain/langchainpy-localai-example by @dependabot in #3750
  • chore(deps): Bump llama-index from 0.11.14 to 0.11.16 in /examples/chainlit by @dependabot in #3753
  • chore(deps): Bump streamlit from 1.38.0 to 1.39.0 in /examples/streamlit-bot by @dependabot in #3757
  • chore(deps): Bump debugpy from 1.8.2 to 1.8.6 in /examples/langchain/langchainpy-localai-example by @dependabot in #3751
  • chore(deps): Bump langchain from 0.3.1 to 0.3.2 in /examples/langchain/langchainpy-localai-example by @dependabot in #3752
  • chore(deps): Bump openai from 1.50.2 to 1.51.1 in /examples/langchain-chroma by @dependabot in #3758
  • chore(deps): Bump llama-index from 0.11.14 to 0.11.16 in /examples/langchain-chroma by @dependabot in #3760
  • chore(deps): Bump nginx from 1.27.0 to 1.27.2 in /examples/k8sgpt by @dependabot in #3761
  • chore(deps): Bump appleboy/ssh-action from 1.0.3 to 1.1.0 by @dependabot in #3762
  • chore: ⬆️ Update ggerganov/llama.cpp to 6374743747b14db4eb73ce82ae449a2978bc3b47 by @localai-bot in #3763
  • chore: ⬆️ Update ggerganov/whisper.cpp to ebca09a3d1033417b0c630bbbe607b0f185b1488 by @localai-bot in #3764
  • chore: ⬆️ Update ggerganov/llama.cpp to dca1d4b58a7f1acf1bd253be84e50d6367f492fd by @localai-bot in #3769
  • chore: ⬆️ Update ggerganov/whisper.cpp to fdbfb460ed546452a5d53611bba66d10d842e719 by @localai-bot in #3768
  • chore: ⬆️ Update ggerganov/llama.cpp to c81f3bbb051f8b736e117dfc78c99d7c4e0450f6 by @localai-bot in #3775
  • chore: ⬆️ Update ggerganov/llama.cpp to 0e9f760eb12546704ef8fa72577bc1a3ffe1bc04 by @localai-bot in #3786
  • chore(deps): bump llama-cpp to 96776405a17034dcfd53d3ddf5d142d34bdbb657 by @mudler in #3793

Other Changes

  • docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #3665
  • feat(swagger): update swagger by @localai-bot in #3664
  • chore(refactor): track grpcProcess in the model structure by @mudler in #3663
  • chore: get model also from query by @mudler in #3716
  • chore(federated): display a message when nodes are not available by @mudler in #3721
  • chore(vllm): do not install from source by @mudler in #3745
  • chore(Dockerfile): default to cmake from package manager by @mudler in #3746
  • chore(tests): improve rwkv tests and consume TEST_FLAKES by @mudler in #3765

New Contributors

Full Changelog: v2.21.1...v2.22.0
