github mudler/LocalAI v2.13.0
🖼️ v2.13.0 - Model gallery edition


Hello folks, Ettore here - I'm happy to announce the v2.13.0 LocalAI release is out, with many features!

Below is a short breakdown of the hottest features introduced in this release - but there are many other improvements (especially from the community) as well, so don't miss the full changelog!

Check out the changelog below for an overview of everything that went into this release (this one is quite packed).

🖼️ Model gallery

This is the first release with a model gallery in the WebUI: there is now a "Model" button in the WebUI that takes you to a selection of models:


You can now choose between models such as stablediffusion, llama3, tts, embeddings and more! The gallery is growing steadily and is kept up-to-date.

The models are simple YAML files hosted in this repository: https://github.com/mudler/LocalAI/tree/master/gallery - you can host your own repository with your own model index, or contribute to LocalAI directly.

If you want to contribute models, you can open a PR against the gallery directory: https://github.com/mudler/LocalAI/tree/master/gallery.
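Besides the WebUI, installed galleries can also be driven from the API. Here is a minimal sketch using LocalAI's model-applying endpoint; the address and the model id below are illustrative assumptions (check the gallery for real model names):

```shell
# Ask a running LocalAI instance (assumed at localhost:8080) to install
# a model from the configured galleries.
# The id "localai@llama3-8b-instruct" is a hypothetical example.
curl http://localhost:8080/models/apply \
  -H "Content-Type: application/json" \
  -d '{"id": "localai@llama3-8b-instruct"}'
```

The server downloads the model in the background; progress can be followed from the WebUI.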

Rerankers

I'm excited to introduce a new backend for rerankers. LocalAI now implements the Jina API (https://jina.ai/reranker/#apiform) as a compatibility layer, so you can point existing Jina clients at the LocalAI address. Under the hood, it uses https://github.com/AnswerDotAI/rerankers.


You can test this by using the container images that ship with Python support (this does NOT work with the core images) together with a model config file like the one below, or by installing cross-encoder from the gallery in the UI:

    name: jina-reranker-v1-base-en
    backend: rerankers
    parameters:
      model: cross-encoder

and test it with:

    curl http://localhost:8080/v1/rerank \
      -H "Content-Type: application/json" \
      -d '{
      "model": "jina-reranker-v1-base-en",
      "query": "Organic skincare products for sensitive skin",
      "documents": [
        "Eco-friendly kitchenware for modern homes",
        "Biodegradable cleaning supplies for eco-conscious consumers",
        "Organic cotton baby clothes for sensitive skin",
        "Natural organic skincare range for sensitive skin",
        "Tech gadgets for smart homes: 2024 edition",
        "Sustainable gardening tools and compost solutions",
        "Sensitive skin-friendly facial cleansers and toners",
        "Organic food wraps and storage solutions",
        "All-natural pet food for dogs with allergies",
        "Yoga mats made from recycled materials"
      ],
      "top_n": 3
    }'

Parler-tts

There is a new TTS backend available: parler-tts (https://github.com/huggingface/parler-tts). You can install and configure the model directly from the gallery.
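Once a parler-tts model is installed, it can be exercised through LocalAI's TTS endpoint. A minimal sketch, assuming an instance on localhost:8080 and a hypothetical model name (use whatever name the gallery installed):

```shell
# Generate speech from text with a running LocalAI instance.
# "parler-tts-mini" is an assumed example model name, not a guaranteed id.
curl http://localhost:8080/tts \
  -H "Content-Type: application/json" \
  -d '{"model": "parler-tts-mini", "input": "Hello from LocalAI!"}' \
  -o output.wav
```

The response body is the generated audio, saved here to output.wav.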

🎈 Lots of small improvements behind the scenes!

Thanks to our outstanding community, we have enhanced the performance and stability of LocalAI across various modules. From backend optimizations to front-end adjustments, every tweak helps make LocalAI smoother and more robust.

📣 Spread the word!

First off, a massive thank you (again!) to each and every one of you who've chipped in to squash bugs and suggest cool new features for LocalAI. Your help, kind words, and brilliant ideas are truly appreciated - more than words can say!

And to those of you who've been heroes, giving up your own time to help out fellow users on Discord and in our repo, you're absolutely amazing. We couldn't have asked for a better community.

Just so you know, LocalAI doesn't have the luxury of big corporate sponsors behind it. It's all us, folks. So, if you've found value in what we're building together and want to keep the momentum going, consider showing your support. A little shoutout on your favorite social platforms using @LocalAI_OSS and @mudler_it or joining our sponsors can make a big difference.

Also, if you haven't yet joined our Discord, come on over! Here's the link: https://discord.gg/uJAeKSAGDy

Every bit of support, every mention, and every star adds up and helps us keep this ship sailing. Let's keep making LocalAI awesome together!

Thanks a ton, and here's to more exciting times ahead with LocalAI!

What's Changed

Bug fixes 🐛

  • fix(autogptq): do not use_triton with qwen-vl by @thiner in #1985
  • fix: respect concurrency from parent build parameters when building GRPC by @cryptk in #2023
  • ci: fix release pipeline missing dependencies by @mudler in #2025
  • fix: remove build path from help text documentation by @cryptk in #2037
  • fix: previous CLI rework broke debug logging by @cryptk in #2036
  • fix(fncall): fix regression introduced in #1963 by @mudler in #2048
  • fix: adjust some sources names to match the naming of their repositories by @cryptk in #2061
  • fix: move the GRPC cache generation workflow into it's own concurrency group by @cryptk in #2071
  • fix(llama.cpp): set -1 as default for max tokens by @mudler in #2087
  • fix(llama.cpp-ggml): fixup max_tokens for old backend by @mudler in #2094
  • fix missing TrustRemoteCode in OpenVINO model load by @fakezeta in #2114
  • Incl ocv pkg for diffsusers utils by @jtwolfe in #2115

Exciting New Features 🎉

  • feat: kong cli refactor fixes #1955 by @cryptk in #1974
  • feat: add flash-attn in nvidia and rocm envs by @golgeek in #1995
  • feat: use tokenizer.apply_chat_template() in vLLM by @golgeek in #1990
  • feat(gallery): support ConfigURLs by @mudler in #2012
  • fix: dont commit generated files to git by @cryptk in #1993
  • feat(parler-tts): Add new backend by @mudler in #2027
  • feat(grpc): return consumed token count and update response accordingly by @mudler in #2035
  • feat(store): add Golang client by @mudler in #1977
  • feat(functions): support models with no grammar, add tests by @mudler in #2068
  • refactor(template): isolate and add tests by @mudler in #2069
  • feat: fiber logs with zerlog and add trace level by @cryptk in #2082
  • models(gallery): add gallery by @mudler in #2078
  • Add tensor_parallel_size setting to vllm setting items by @Taikono-Himazin in #2085
  • Transformer Backend: Implementing use_tokenizer_template and stop_prompts options by @fakezeta in #2090
  • feat: Galleries UI by @mudler in #2104
  • Transformers Backend: max_tokens adherence to OpenAI API by @fakezeta in #2108
  • Fix cleanup sonarqube findings by @cryptk in #2106
  • feat(models-ui): minor visual enhancements by @mudler in #2109
  • fix(gallery): show a fake image if no there is no icon by @mudler in #2111
  • feat(rerankers): Add new backend, support jina rerankers API by @mudler in #2121

🧠 Models

  • models(llama3): add llama3 to embedded models by @mudler in #2074
  • feat(gallery): add llama3, hermes, phi-3, and others by @mudler in #2110
  • models(gallery): add new models to the gallery by @mudler in #2124
  • models(gallery): add more models by @mudler in #2129

📖 Documentation and examples

👒 Dependencies

  • deps: Update version of vLLM to add support of Cohere Command_R model in vLLM inference by @holyCowMp3 in #1975
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #1991
  • build(deps): bump google.golang.org/protobuf from 1.31.0 to 1.33.0 by @dependabot in #1998
  • build(deps): bump github.com/docker/docker from 20.10.7+incompatible to 24.0.9+incompatible by @dependabot in #1999
  • build(deps): bump github.com/gofiber/fiber/v2 from 2.52.0 to 2.52.1 by @dependabot in #2001
  • build(deps): bump actions/checkout from 3 to 4 by @dependabot in #2002
  • build(deps): bump actions/setup-go from 4 to 5 by @dependabot in #2003
  • build(deps): bump peter-evans/create-pull-request from 5 to 6 by @dependabot in #2005
  • build(deps): bump actions/cache from 3 to 4 by @dependabot in #2006
  • build(deps): bump actions/upload-artifact from 3 to 4 by @dependabot in #2007
  • build(deps): bump github.com/charmbracelet/glamour from 0.6.0 to 0.7.0 by @dependabot in #2004
  • build(deps): bump github.com/gofiber/fiber/v2 from 2.52.0 to 2.52.4 by @dependabot in #2008
  • build(deps): bump github.com/opencontainers/runc from 1.1.5 to 1.1.12 by @dependabot in #2000
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2014
  • build(deps): bump the pip group across 4 directories with 8 updates by @dependabot in #2017
  • build(deps): bump follow-redirects from 1.15.2 to 1.15.6 in /examples/langchain/langchainjs-localai-example by @dependabot in #2020
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2024
  • build(deps): bump softprops/action-gh-release from 1 to 2 by @dependabot in #2039
  • build(deps): bump dependabot/fetch-metadata from 1.3.4 to 2.0.0 by @dependabot in #2040
  • build(deps): bump github/codeql-action from 2 to 3 by @dependabot in #2041
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2043
  • ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2042
  • build(deps): bump the pip group across 4 directories with 8 updates by @dependabot in #2049
  • ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2050
  • ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2060
  • build(deps): bump aiohttp from 3.9.2 to 3.9.4 in /examples/langchain/langchainpy-localai-example in the pip group across 1 directory by @dependabot in #2067
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2089
  • deps(llama.cpp): update, use better model for function call tests by @mudler in #2119
  • ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2122
  • ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2123
  • build(deps): bump pydantic from 1.10.7 to 1.10.13 in /examples/langchain/langchainpy-localai-example in the pip group across 1 directory by @dependabot in #2125
  • feat(swagger): update swagger by @localai-bot in #2128

Other Changes

New Contributors

Full Changelog: v2.12.4...v2.13.0
