github mudler/LocalAI v2.21.0

6 hours ago

💡 Highlights!

LocalAI v2.21 release is out!

  • Deprecation of the exllama backend
  • AIO images now have gpt-4o instead of gpt-4-vision-preview for Vision API
  • vLLM backend now supports embeddings
  • New endpoint to list system information (/system)
  • trust_remote_code is now respected by sentencetransformers
  • Auto warm-up and load models on start
  • coqui backend switched to the community-maintained fork

What's Changed

Breaking Changes 🛠

  • chore(exllama): drop exllama backend by @mudler in #3536
  • chore(aio): rename gpt-4-vision-preview to gpt-4o by @mudler in #3597

Exciting New Features 🎉

  • feat: elevenlabs sound-generation api by @dave-gray101 in #3355
  • feat(vllm): add support for embeddings by @mudler in #3440
  • feat: add endpoint to list system informations by @mudler in #3449
  • feat: extract output with regexes from LLMs by @mudler in #3491
  • feat: allow setting trust_remote_code for sentencetransformers backend by @Nyralei in #3552
  • feat(api): allow to pass videos to backends by @mudler in #3601
  • feat(api): allow to pass audios to backends by @mudler in #3603
  • feat: auto load into memory on startup by @sozercan in #3627
  • feat(coqui): switch to maintained community fork by @mudler in #3625

Bug fixes 🐛

  • fix(p2p): correctly allow to pass extra args to llama.cpp by @mudler in #3368
  • fix(model-loading): keep track of open GRPC Clients by @mudler in #3377
  • fix(tts): check error before inspecting result by @mudler in #3415
  • fix(shutdown): do not shutdown immediately busy backends by @mudler in #3543
  • fix(parler-tts): fix install with sycl by @mudler in #3624
  • fix(ci): fixup checksum scanning pipeline by @mudler in #3631
  • fix(hipblas): do not push all variants to hipblas builds by @mudler in #3630

🧠 Models

  • chore(model-gallery): add more quants for popular models by @mudler in #3365
  • models(gallery): add phi-3.5 by @mudler in #3376
  • models(gallery): add calme-2.1-phi3.5-4b-i1 by @mudler in #3383
  • models(gallery): add magnum-v3-34b by @mudler in #3384
  • models(gallery): add phi-3.5-vision by @mudler in #3421
  • Revert "models(gallery): add phi-3.5-vision" by @mudler in #3422
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #3425
  • feat: Added Piper voice it-paola-medium by @fakezeta in #3434
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #3442
  • models(gallery): add hubble-4b-v1 by @mudler in #3444
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #3446
  • models(gallery): add yi-coder (and variants) by @mudler in #3482
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #3486
  • models(gallery): add reflection-llama-3.1-70b by @mudler in #3487
  • models(gallery): add athena-codegemma-2-2b-it by @mudler in #3490
  • models(gallery): add azure_dusk-v0.2-iq-imatrix by @mudler in #3538
  • models(gallery): add mn-12b-lyra-v4-iq-imatrix by @mudler in #3539
  • models(gallery): add datagemma models by @mudler in #3540
  • models(gallery): add l3.1-8b-niitama-v1.1-iq-imatrix by @mudler in #3550
  • models(gallery): add llama-3.1-8b-stheno-v3.4-iq-imatrix by @mudler in #3551
  • fix: gallery/index.yaml comment spacing by @dave-gray101 in #3585
  • models(gallery): add qwen2.5-14b-instruct by @mudler in #3607
  • models(gallery): add qwen2.5-math-7b-instruct by @mudler in #3609
  • models(gallery): add qwen2.5-14b_uncencored by @mudler in #3610
  • models(gallery): add qwen2.5-coder-7b-instruct by @mudler in #3611
  • models(gallery): add qwen2.5-math-72b-instruct by @mudler in #3612
  • models(gallery): add qwen2.5-0.5b-instruct, qwen2.5-1.5b-instruct by @mudler in #3613
  • models(gallery): add qwen2.5 32B, 72B, 32B Instruct by @mudler in #3614
  • models(gallery): add llama-3.1-supernova-lite-reflection-v1.0-i1 by @mudler in #3615
  • models(gallery): add llama-3.1-supernova-lite by @mudler in #3616
  • models(gallery): add llama3.1-8b-shiningvaliant2 by @mudler in #3617
  • models(gallery): add buddy2 by @mudler in #3618
  • models(gallery): add llama-3.1-8b-arliai-rpmax-v1.1 by @mudler in #3619
  • Fix NeuralDaredevil URL by @nyx4ris in #3621
  • models(gallery): add nightygurps-14b-v1.1 by @mudler in #3633
  • models(gallery): add gemma-2-9b-arliai-rpmax-v1.1 by @mudler in #3634
  • models(gallery): add gemma-2-2b-arliai-rpmax-v1.1 by @mudler in #3635
  • models(gallery): add acolyte-22b-i1 by @mudler in #3636

📖 Documentation and examples

👒 Dependencies

  • chore: ⬆️ Update ggerganov/llama.cpp to 3ba780e2a8f0ffe13f571b27f0bbf2ca5a199efc by @localai-bot in #3361
  • chore(deps): Bump openai from 1.41.1 to 1.42.0 in /examples/functions by @dependabot in #3390
  • chore(deps): Bump docs/themes/hugo-theme-relearn from 82a5e98 to 3a0ae52 by @dependabot in #3391
  • chore(deps): Bump idna from 3.7 to 3.8 in /examples/langchain/langchainpy-localai-example by @dependabot in #3399
  • chore(deps): Bump llama-index from 0.10.65 to 0.11.1 in /examples/chainlit by @dependabot in #3404
  • chore(deps): Bump llama-index from 0.10.67.post1 to 0.11.1 in /examples/langchain-chroma by @dependabot in #3406
  • chore(deps): Bump marshmallow from 3.21.3 to 3.22.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3400
  • chore(deps): Bump openai from 1.40.5 to 1.42.0 in /examples/langchain-chroma by @dependabot in #3405
  • chore(deps): Bump openai from 1.41.1 to 1.42.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3401
  • chore(deps): update edgevpn to v0.28 by @mudler in #3412
  • chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/functions by @dependabot in #3453
  • chore(deps): Bump certifi from 2024.7.4 to 2024.8.30 in /examples/langchain/langchainpy-localai-example by @dependabot in #3457
  • chore(deps): Bump yarl from 1.9.4 to 1.9.7 in /examples/langchain/langchainpy-localai-example by @dependabot in #3459
  • chore(deps): Bump langchain-community from 0.2.12 to 0.2.15 in /examples/langchain/langchainpy-localai-example by @dependabot in #3461
  • chore(deps): Bump llama-index from 0.11.1 to 0.11.4 in /examples/chainlit by @dependabot in #3462
  • chore(deps): Bump llama-index from 0.11.1 to 0.11.4 in /examples/langchain-chroma by @dependabot in #3467
  • chore(deps): Bump docs/themes/hugo-theme-relearn from 3a0ae52 to 550a6ee by @dependabot in #3472
  • chore(deps): Bump openai from 1.42.0 to 1.43.0 in /examples/functions by @dependabot in #3452
  • chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/langchain/langchainpy-localai-example by @dependabot in #3460
  • chore(deps): Bump openai from 1.42.0 to 1.43.0 in /examples/langchain-chroma by @dependabot in #3468
  • chore(deps): Bump langchain from 0.2.14 to 0.2.15 in /examples/langchain-chroma by @dependabot in #3466
  • chore(deps): Bump streamlit from 1.37.1 to 1.38.0 in /examples/streamlit-bot by @dependabot in #3465
  • chore(deps): Bump openai from 1.42.0 to 1.43.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3456
  • chore(deps): Bump langchain-community from 0.2.15 to 0.2.16 in /examples/langchain/langchainpy-localai-example by @dependabot in #3500
  • chore(deps): Bump openai from 1.43.0 to 1.44.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3504
  • chore(deps): Bump docs/themes/hugo-theme-relearn from 550a6ee to f696f60 by @dependabot in #3505
  • chore(deps): Bump langchain from 0.2.15 to 0.2.16 in /examples/langchain-chroma by @dependabot in #3507
  • chore(deps): Bump peter-evans/create-pull-request from 6 to 7 by @dependabot in #3518
  • chore(deps): Bump openai from 1.43.0 to 1.44.0 in /examples/functions by @dependabot in #3522
  • chore(deps): Bump langchain from 0.2.15 to 0.2.16 in /examples/langchain/langchainpy-localai-example by @dependabot in #3502
  • chore(deps): Bump numpy from 2.1.0 to 2.1.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3503
  • chore(deps): Bump llama-index from 0.11.4 to 0.11.7 in /examples/langchain-chroma by @dependabot in #3508
  • chore(deps): Bump langchain from 0.2.15 to 0.2.16 in /examples/functions by @dependabot in #3521
  • chore(deps): Bump openai from 1.43.0 to 1.44.1 in /examples/langchain-chroma by @dependabot in #3532
  • chore(deps): Bump yarl from 1.9.7 to 1.11.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3501
  • chore(deps): Bump llama-index from 0.11.4 to 0.11.7 in /examples/chainlit by @dependabot in #3516
  • chore(deps): update llama.cpp to 6262d13e0b2da91f230129a93a996609a2fa2f2 by @mudler in #3549
  • chore(deps): Bump docs/themes/hugo-theme-relearn from f696f60 to d5a0ee0 by @dependabot in #3558
  • chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/coqui by @dependabot in #3554
  • chore(deps): Bump langchain from 0.2.16 to 0.3.0 in /examples/functions by @dependabot in #3559
  • chore(deps): Bump openai from 1.44.1 to 1.45.1 in /examples/langchain-chroma by @dependabot in #3556
  • chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/autogptq by @dependabot in #3553
  • chore(deps): Bump securego/gosec from 2.21.0 to 2.21.2 by @dependabot in #3561
  • chore(deps): Bump setuptools from 69.5.1 to 75.1.0 in /backend/python/transformers-musicgen by @dependabot in #3564
  • chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/parler-tts by @dependabot in #3565
  • chore(deps): Bump sentence-transformers from 3.0.1 to 3.1.0 in /backend/python/sentencetransformers by @dependabot in #3566
  • chore(deps): Bump llama-index from 0.11.7 to 0.11.9 in /examples/chainlit by @dependabot in #3567
  • chore(deps): Bump weaviate-client from 4.6.7 to 4.8.1 in /examples/chainlit by @dependabot in #3568
  • chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/vall-e-x by @dependabot in #3570
  • chore(deps): Bump greenlet from 3.0.3 to 3.1.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3571
  • chore(deps): Bump setuptools from 70.3.0 to 75.1.0 in /backend/python/diffusers by @dependabot in #3575
  • chore(deps): Bump setuptools from 70.3.0 to 75.1.0 in /backend/python/bark by @dependabot in #3574
  • chore(deps): Bump setuptools from 72.1.0 to 75.1.0 in /backend/python/rerankers by @dependabot in #3578
  • chore(deps): Bump setuptools from 69.5.1 to 75.1.0 in /backend/python/transformers by @dependabot in #3579
  • chore(deps): Bump setuptools from 70.3.0 to 75.1.0 in /backend/python/vllm by @dependabot in #3580
  • chore(deps): Bump langchain from 0.2.16 to 0.3.0 in /examples/langchain-chroma by @dependabot in #3557
  • chore(deps): Bump openai from 1.44.0 to 1.45.1 in /examples/functions by @dependabot in #3560
  • chore(deps): Bump langchain from 0.2.16 to 0.3.0 in /examples/langchain/langchainpy-localai-example by @dependabot in #3577
  • chore(deps): Bump openai from 1.44.0 to 1.45.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3573
  • chore(deps): Bump pypinyin from 0.50.0 to 0.53.0 in /backend/python/openvoice by @dependabot in #3562
  • chore(deps): Bump yarl from 1.11.0 to 1.11.1 in /examples/langchain/langchainpy-localai-example by @dependabot in #3643
  • chore(deps): Bump urllib3 from 2.2.2 to 2.2.3 in /examples/langchain/langchainpy-localai-example by @dependabot in #3646
  • chore(deps): Bump idna from 3.8 to 3.10 in /examples/langchain/langchainpy-localai-example by @dependabot in #3644
  • chore(deps): Bump sqlalchemy from 2.0.32 to 2.0.35 in /examples/langchain/langchainpy-localai-example by @dependabot in #3649

Other Changes

  • feat: external backend launching log improvements and relative path support by @dave-gray101 in #3348
  • Update quickstart.md by @grant-wilson in #3373
  • feat(swagger): update swagger by @localai-bot in #3370
  • fix: devcontainer utils.sh ssh copy improvements by @dave-gray101 in #3372
  • chore(cuda): reduce binary size by @mudler in #3379
  • chore(deps): update edgevpn by @mudler in #3385
  • chore: ⬆️ Update ggerganov/llama.cpp to 7d787ed96c32be18603c158ab0276992cf0dc346 by @localai-bot in #3409
  • chore: ⬆️ Update ggerganov/llama.cpp to 20f1789dfb4e535d64ba2f523c64929e7891f428 by @localai-bot in #3417
  • chore: ⬆️ Update ggerganov/llama.cpp to 9fe94ccac92693d4ae1bc283ff0574e8b3f4e765 by @localai-bot in #3424
  • chore(cli): be consistent between workers and expose ExtraLLamaCPPArgs to both by @mudler in #3428
  • chore(tests): replace runaway models for tests by @mudler in #3432
  • chore(model-loader): increase test coverage of model loader by @mudler in #3433
  • chore(deps): update llama.cpp by @mudler in #3438
  • chore: ⬆️ Update ggerganov/llama.cpp to a47667cff41f5a198eb791974e0afcc1cddd3229 by @localai-bot in #3441
  • chore: ⬆️ Update ggerganov/llama.cpp to 8f1d81a0b6f50b9bad72db0b6fcd299ad9ecd48c by @localai-bot in #3445
  • fix: untangle pkg/grpc and core/schema for Transcription by @dave-gray101 in #3419
  • chore(deps): update whisper.cpp by @mudler in #3443
  • chore: ⬆️ Update ggerganov/llama.cpp to 48baa61eccdca9205daf8d620ba28055c2347b64 by @localai-bot in #3474
  • chore: ⬆️ Update ggerganov/whisper.cpp to 5236f0278420ab776d1787c4330678d80219b4b6 by @localai-bot in #3475
  • chore: ⬆️ Update ggerganov/llama.cpp to 8962422b1c6f9b8b15f5aeaea42600bcc2d44177 by @localai-bot in #3478
  • fix: purge a few remaining runway model references by @dave-gray101 in #3480
  • chore: ⬆️ Update ggerganov/llama.cpp to 581c305186a0ff93f360346c57e21fe16e967bb7 by @localai-bot in #3481
  • chore: ⬆️ Update ggerganov/llama.cpp to 4db04784f96757d74f74c8c110c2a00d55e33514 by @localai-bot in #3485
  • feat(swagger): update swagger by @localai-bot in #3484
  • chore: ⬆️ Update ggerganov/llama.cpp to 815b1fb20a53e439882171757825bacb1350de04 by @localai-bot in #3489
  • chore: ⬆️ Update ggerganov/whisper.cpp to 5caa19240d55bfd6ee316d50fbad32c6e9c39528 by @localai-bot in #3494
  • fix: speedup and improve cachability of docker build of builder-sd by @dave-gray101 in #3430
  • chore: ⬆️ Update ggerganov/whisper.cpp to a551933542d956ae84634937acd2942eb40efaaf by @localai-bot in #3534
  • chore(deps): update llama.cpp by @mudler in #3497
  • chore(gosec): fix CI by @mudler in #3537
  • chore: ⬆️ Update ggerganov/llama.cpp to feff4aa8461da7c432d144c11da4802e41fef3cf by @localai-bot in #3542
  • chore: ⬆️ Update ggerganov/whisper.cpp to 049b3a0e53c8a8e4c4576c06a1a4fccf0063a73f by @localai-bot in #3548
  • feat: auth v2 - supersedes #2894 by @dave-gray101 in #3476
  • chore: ⬆️ Update ggerganov/llama.cpp to 23e0d70bacaaca1429d365a44aa9e7434f17823b by @localai-bot in #3581
  • Revert "chore(deps): Bump setuptools from 69.5.1 to 75.1.0 in /backend/python/transformers" by @mudler in #3586
  • chore(refactor): drop duplicated shutdown logics by @mudler in #3589
  • Revert "chore(deps): Bump securego/gosec from 2.21.0 to 2.21.2" by @mudler in #3590
  • chore: ⬆️ Update ggerganov/llama.cpp to 8b836ae731bbb2c5640bc47df5b0a78ffcb129cb by @localai-bot in #3591
  • chore: ⬆️ Update ggerganov/whisper.cpp to 5b1ce40fa882e9cb8630b48032067a1ed2f1534f by @localai-bot in #3592
  • chore: ⬆️ Update ggerganov/llama.cpp to 64c6af3195c3cd4aa3328a1282d29cd2635c34c9 by @localai-bot in #3598
  • feat(swagger): update swagger by @localai-bot in #3604
  • chore: ⬆️ Update ggerganov/llama.cpp to 6026da52d6942b253df835070619775d849d0258 by @localai-bot in #3605
  • chore: ⬆️ Update ggerganov/whisper.cpp to 34972dbe221709323714fc8402f2e24041d48213 by @localai-bot in #3623
  • chore: ⬆️ Update ggerganov/llama.cpp to 63351143b2ea5efe9f8b9c61f553af8a51f1deff by @localai-bot in #3622
  • chore: ⬆️ Update ggerganov/llama.cpp to d09770cae71b416c032ec143dda530f7413c4038 by @localai-bot in #3626
  • chore: ⬆️ Update ggerganov/llama.cpp to c35e586ea57221844442c65a1172498c54971cb0 by @localai-bot in #3629
  • chore: ⬆️ Update ggerganov/llama.cpp to f0c7b5edf82aa200656fd88c11ae3a805d7130bf by @localai-bot in #3653
  • test: preliminary tests and merge fix for authv2 by @dave-gray101 in #3584

New Contributors

Full Changelog: v2.20.1...v2.21.0

Don't miss a new LocalAI release

NewReleases is sending notifications on new releases.