
LocalAI release v2.24.0!


🚀 Highlights

  • Backend Deprecation: rwkv.cpp and bert.cpp have been removed; their functionality is now provided by llama.cpp, simplifying installation and improving performance.
  • New Backends Added: Introducing bark.cpp for text-to-audio and stablediffusion.cpp for image generation, both powered by the ggml framework.
  • Voice Activity Detection (VAD): Added support for silero-vad to detect speech in audio streams.
  • WebUI Improvements: Now supports API key authentication for enhanced security.
  • Real-Time Token Usage: Monitor token consumption during streamed outputs.
  • Expanded P2P Settings: Greater flexibility with new configuration options like listen_maddrs, dht_announce_maddrs, and bootstrap_peers.

📤 Backends Deprecation

As part of our cleanup efforts, the rwkv.cpp and bert.cpp backends have been deprecated. Their functionalities are now integrated into llama.cpp, offering a more streamlined and efficient experience.

🆕 New Backends Introduced

  • bark.cpp Backend: Transform text into realistic audio using Bark, a transformer-based text-to-audio model (a request sketch follows this list). Install it easily with:

    local-ai models install bark-cpp-small

    Or start it directly:

    local-ai run bark-cpp-small
  • stablediffusion.cpp Backend: Create high-quality images from textual descriptions using the Stable Diffusion backend, now leveraging the ggml framework (see the example after this list).

  • Voice Activity Detection with silero-vad: Introducing support for accurate speech segment detection in audio streams (see the sketch after this list). Install via:

    local-ai models install silero-vad

    Or configure it through the WebUI.
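
Once installed, bark-cpp-small can be exercised through the TTS endpoint. Below is a minimal sketch, assuming the server listens on the default port 8080 and accepts the gallery model name directly; the output format may vary:

    curl http://localhost:8080/tts \
      -H "Content-Type: application/json" \
      -d '{"model": "bark-cpp-small", "input": "Hello from LocalAI!"}' \
      --output hello.wav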
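
For image generation with the new ggml-based backend, requests go through the OpenAI-compatible images endpoint. A minimal sketch, assuming a gallery model such as flux.1-dev-ggml (added in this release) is installed and fits your hardware:

    # Install an image model from the gallery, then request an image
    local-ai models install flux.1-dev-ggml
    curl http://localhost:8080/v1/images/generations \
      -H "Content-Type: application/json" \
      -d '{"model": "flux.1-dev-ggml", "prompt": "a cozy cabin in a snowy forest", "size": "256x256"}'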
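
For silero-vad, detection runs against raw audio samples. The sketch below is an assumption about the request shape (the endpoint path and payload are not documented in these notes), so consult the API documentation for the exact format:

    # Assumed request shape: model name plus an array of PCM float samples
    curl http://localhost:8080/vad \
      -H "Content-Type: application/json" \
      -d '{"model": "silero-vad", "audio": [0.0, 0.021, 0.044, 0.012]}'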

🔒 WebUI Access with API Keys

The WebUI now supports API key authentication. If one or more API keys are configured, the WebUI automatically presents an authentication page.
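
API clients authenticate the same way, by sending the key as a Bearer token. A minimal sketch, assuming keys are configured at startup via the --api-keys flag (the flag and environment variable names are assumptions here; check the documentation):

    # Start LocalAI with an API key (flag name assumed)
    local-ai run --api-keys "my-secret-key"

    # Authenticated request against the REST API
    curl http://localhost:8080/v1/models \
      -H "Authorization: Bearer my-secret-key"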

🏆 Enhancements and Features

  • Real-Time Token Usage: Monitor token consumption dynamically during streamed outputs, helping to track performance and manage costs. A request sketch follows this list.
  • P2P Configuration: New settings for advanced peer-to-peer mode:
    • listen_maddrs: Define specific multiaddresses for your node.
    • dht_announce_maddrs: Specify addresses to announce to the DHT network.
    • bootstrap_peers: Set custom bootstrap peers for initial connectivity.
    These options offer more control, especially in constrained networks or custom P2P environments; an illustrative configuration sketch follows this list.
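
To see the streamed token usage in action, a plain streaming chat completion is enough. A minimal sketch, assuming any installed chat model; the exact chunk carrying the usage object may vary:

    # -N disables curl buffering so SSE chunks print as they arrive;
    # watch for the "usage" object with prompt/completion token counts.
    curl -N http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
            "model": "qwen2.5-coder-7b-instruct",
            "stream": true,
            "messages": [{"role": "user", "content": "Write a haiku about llamas."}]
          }'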
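
For the new P2P options, the sketch below is purely illustrative: the environment variable names are hypothetical (consult the P2P documentation for the real ones), and only the multiaddress values show the expected libp2p format.

    # Hypothetical variable names, shown only to illustrate multiaddress syntax
    export LOCALAI_P2P_LISTEN_MADDRS="/ip4/0.0.0.0/tcp/40001"
    export LOCALAI_P2P_DHT_ANNOUNCE_MADDRS="/ip4/203.0.113.10/tcp/40001"
    export LOCALAI_P2P_BOOTSTRAP_PEERS="/ip4/203.0.113.20/tcp/40001/p2p/12D3KooWExamplePeerID"
    local-ai run --p2p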

🖼️ New Models in the Gallery

We've significantly expanded our model gallery with a variety of new models to cater to diverse AI applications. Among these:

  • Calme-3 Qwen2.5 Series: Enhanced language models offering improved understanding and generation capabilities.
  • Mistral-Nemo-Prism-12b: A powerful model designed for complex language tasks.
  • Llama 3.1 and 3.2 Series: Upgraded versions of the Llama models with better performance and accuracy.
  • Qwen2.5-Coder Series: Specialized models optimized for code generation and programming language understanding.
  • Rombos-Coder Series: Advanced coder models for sophisticated code-related tasks.
  • Silero-VAD: High-quality voice activity detection model for audio processing applications.
  • Bark-Cpp-Small: Lightweight audio generation model suitable for quick and efficient audio synthesis.

Explore these models and more in our updated model gallery to find the perfect fit for your project needs.

🐞 Bug Fixes and Improvements

  • Performance Enhancements: Fixed AVX flag handling when accelerated binaries are used, and embedded the Metal file into the resulting binary on macOS for better out-of-the-box performance.
  • Dependency Updates: Upgraded various dependencies to ensure compatibility, security, and performance improvements across the board.
  • Parsing Corrections: Fixed parsing issues related to maddr and ExtraLLamaCPPArgs in P2P configurations.

📚 Documentation and Examples

  • Updated Guides: Refreshed documentation with new configuration examples, making it easier to get started and integrate the latest features.

📥 How to Upgrade

To upgrade to LocalAI v2.24.0:

  • Download the Latest Release: Get the binaries from our GitHub Releases page.
  • Update Docker Image: Pull the latest Docker image using:
    docker pull localai/localai:latest

See also the Documentation at: https://localai.io/basics/container/#standard-container-images
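
After pulling, the container can be started with something like the sketch below (8080 is LocalAI's default port; pick the image tag that matches your hardware, as described in the linked documentation):

    docker run -ti -p 8080:8080 --name local-ai localai/localai:latest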

Happy hacking!

What's Changed

Breaking Changes 🛠

Bug fixes 🐛

  • fix(hipblas): disable avx flags when accellerated bins are used by @mudler in #4167
  • chore(deps): bump sycl intel image by @mudler in #4201
  • fix(go.mod): add urfave/cli v2 by @mudler in #4206
  • chore(go.mod): add valyala/fasttemplate by @mudler in #4207
  • fix(p2p): parse maddr correctly by @mudler in #4219
  • fix(p2p): parse correctly ExtraLLamaCPPArgs by @mudler in #4220
  • fix(llama.cpp): embed metal file into result binary for darwin by @mudler in #4279

Exciting New Features 🎉

  • feat: add WebUI API token authorization by @mintyleaf in #4197
  • feat(p2p): add support for configuration of edgevpn listen_maddrs, dht_announce_maddrs and bootstrap_peers by @mintyleaf in #4200
  • feat(silero): add Silero-vad backend by @mudler in #4204
  • feat: include tokens usage for streamed output by @mintyleaf in #4282
  • feat(bark-cpp): add new bark.cpp backend by @mudler in #4287
  • feat(backend): add stablediffusion-ggml by @mudler in #4289

🧠 Models

  • models(gallery): add calme-3 qwen2.5 series by @mudler in #4107
  • models(gallery): add calme-3 qwenloi series by @mudler in #4108
  • models(gallery): add calme-3 llamaloi series by @mudler in #4109
  • models(gallery): add mn-tiramisu-12b by @mudler in #4110
  • models(gallery): add qwen2.5-coder-14b by @mudler in #4125
  • models(gallery): add qwen2.5-coder-3b-instruct by @mudler in #4126
  • models(gallery): add qwen2.5-coder-32b-instruct by @mudler in #4127
  • models(gallery): add qwen2.5-coder-14b-instruct by @mudler in #4128
  • models(gallery): add qwen2.5-coder-1.5b-instruct by @mudler in #4129
  • models(gallery): add qwen2.5-coder-7b-instruct by @mudler in #4130
  • models(gallery): add qwen2.5-coder-7b-3x-instruct-ties-v1.2-i1 by @mudler in #4131
  • models(gallery): add qwen2.5-coder-7b-instruct-abliterated-i1 by @mudler in #4132
  • models(gallery): add rombos-coder-v2.5-qwen-7b by @mudler in #4133
  • models(gallery): add rombos-coder-v2.5-qwen-32b by @mudler in #4134
  • models(gallery): add rombos-coder-v2.5-qwen-14b by @mudler in #4135
  • models(gallery): add eva-qwen2.5-72b-v0.1-i1 by @mudler in #4136
  • models(gallery): add mistral-nemo-prism-12b by @mudler in #4141
  • models(gallery): add tess-3-llama-3.1-70b by @mudler in #4143
  • models(gallery): add celestial-harmony-14b-v1.0-experimental-1016-i1 by @mudler in #4145
  • models(gallery): add llama3.1-8b-enigma by @mudler in #4146
  • chore(model): add llama3.1-8b-cobalt to the gallery by @mudler in #4147
  • chore(model): add qwen2.5-32b-arliai-rpmax-v1.3 to the gallery by @mudler in #4148
  • chore(model): add llama3.2-3b-enigma to the gallery by @mudler in #4149
  • chore(model): add llama-3.1-8b-arliai-rpmax-v1.3 to the gallery by @mudler in #4150
  • chore(model): add magnum-12b-v2.5-kto-i1 to the gallery by @mudler in #4151
  • chore(model): add l3.1-8b-slush-i1 to the gallery by @mudler in #4152
  • models(gallery): add q2.5-ms-mistoria-72b-i1 by @mudler in #4158
  • chore(model): add l3.1-ms-astoria-70b-v2 to the gallery by @mudler in #4159
  • chore(model): add magnum-v2-4b-i1 to the gallery by @mudler in #4160
  • chore(model): add athene-v2-agent to the gallery by @mudler in #4161
  • chore(model): add athene-v2-chat to the gallery by @mudler in #4162
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #4165
  • chore(model): add qwen2.5-7b-nerd-uncensored-v1.7 to the gallery by @mudler in #4171
  • chore(model): add llama3.2-3b-shiningvaliant2-i1 to the gallery by @mudler in #4174
  • chore(model): add l3.1-nemotron-sunfall-v0.7.0-i1 to the gallery by @mudler in #4175
  • chore(model): add evathene-v1.0 to the gallery by @mudler in #4176
  • chore(model): add miniclaus-qw1.5b-unamgs to the gallery by @mudler in #4177
  • chore(model): add silero-vad to the gallery by @mudler in #4210
  • models(gallery): add llama-mesh by @mudler in #4222
  • chore(model): add llama-doctor-3.2-3b-instruct to the gallery by @mudler in #4223
  • chore(model): add copus-2x8b-i1 to the gallery by @mudler in #4225
  • chore(model): add llama-3.1-8b-instruct-ortho-v3 to the gallery by @mudler in #4226
  • chore(model): add llama-3.1-tulu-3-8b-dpo to the gallery by @mudler in #4228
  • chore(model): add marco-o1 to the gallery by @mudler in #4229
  • chore(model): add onellm-doey-v1-llama-3.2-3b to the gallery by @mudler in #4230
  • chore(model): add llama-sentient-3.2-3b-instruct to the gallery by @mudler in #4235
  • chore(model): add qwen2.5-3b-smart-i1 to the gallery by @mudler in #4236
  • chore(model): add l3.1-aspire-heart-matrix-8b to the gallery by @mudler in #4237
  • chore(model): add dark-chivalry_v1.0-i1 to the gallery by @mudler in #4242
  • chore(model): add qwen2.5-coder-32b-instruct-uncensored-i1 to the gallery by @mudler in #4241
  • chore(model): add tulu-3.1-8b-supernova-i1 to the gallery by @mudler in #4243
  • chore(model): add steyrcannon-0.2-qwen2.5-72b to the gallery by @mudler in #4244
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #4261
  • chore(model): add llama-3.1_openscholar-8b to the gallery by @mudler in #4262
  • chore(model): add rwkv-6-world-7b to the gallery by @mudler in #4270
  • chore(model): add q2.5-ms-mistoria-72b-v2 model config by @mudler in #4275
  • chore(model): add llama-3.1-tulu-3-70b-dpo model config by @mudler in #4276
  • chore(model): add llama-3.1-tulu-3-8b-sft to the gallery by @mudler in #4277
  • chore(model): add eva-qwen2.5-72b-v0.2 to the gallery by @mudler in #4278
  • fix(rwkv model): add stoptoken by @mudler in #4283
  • chore(model gallery): add qwq-32b-preview by @mudler in #4284
  • chore(model gallery): add llama-smoltalk-3.2-1b-instruct by @mudler in #4285
  • chore(model gallery): add q2.5-32b-slush-i1 by @mudler in #4292
  • chore(model gallery): add freyja-v4.95-maldv-7b-non-fiction-i1 by @mudler in #4293
  • chore(model gallery): add qwestion-24b by @mudler in #4294
  • chore(model gallery): add volare-i1 by @mudler in #4296
  • chore(model gallery): add skywork-o1-open-llama-3.1-8b by @mudler in #4297
  • chore(model gallery): add teleut-7b by @mudler in #4298
  • chore(model gallery): add sparse-llama-3.1-8b-2of4 by @mudler in #4309
  • chore(model gallery): add qwen2.5-7b-homercreative-mix by @mudler in #4310
  • chore(model gallery): add bggpt-gemma-2-2.6b-it-v1.0 by @mudler in #4311
  • chore(model gallery): add cybercore-qwen-2.1-7b by @mudler in #4314
  • chore(model gallery): add chatty-harry_v3.0 by @mudler in #4315
  • chore(model gallery): add homercreativeanvita-mix-qw7b by @mudler in #4316
  • chore(model gallery): add flux.1-dev-ggml by @mudler in #4317
  • chore(model gallery): add bark-cpp-small by @mudler in #4318

📖 Documentation and examples

👒 Dependencies

  • chore: ⬆️ Update ggerganov/llama.cpp to 4b3a9212b602be3d4e2e3ca26efd796cef13c55e by @localai-bot in #4106
  • chore(deps): Bump setuptools from 69.5.1 to 75.4.0 in /backend/python/transformers by @dependabot in #4117
  • Revert "chore(deps): Bump setuptools from 69.5.1 to 75.4.0 in /backend/python/transformers" by @mudler in #4123
  • chore(deps): Bump dcarbone/install-yq-action from 1.1.1 to 1.2.0 by @dependabot in #4114
  • chore(deps): Bump sentence-transformers from 3.2.0 to 3.3.0 in /backend/python/sentencetransformers by @dependabot in #4120
  • chore: ⬆️ Update ggerganov/llama.cpp to 54ef9cfc726a799e6f454ac22c4815d037716eda by @localai-bot in #4122
  • chore: ⬆️ Update ggerganov/whisper.cpp to f19463ece2d43fd0b605dc513d8800eeb4e2315e by @localai-bot in #4139
  • chore: ⬆️ Update ggerganov/llama.cpp to fb4a0ec0833c71cff5a1a367ba375447ce6106eb by @localai-bot in #4140
  • chore(deps): bump llama-cpp to ae8de6d50a09d49545e0afab2e50cc4acfb280e2 by @mudler in #4157
  • chore: ⬆️ Update ggerganov/llama.cpp to 883d206fbd2c5b2b9b589a9328503b9005e146c9 by @localai-bot in #4164
  • chore(deps): bump grpcio to 1.68.0 by @mudler in #4166
  • chore: ⬆️ Update ggerganov/whisper.cpp to 01d3bd7d5ccd1956a7ddf1b57ee92d69f35aad93 by @localai-bot in #4163
  • chore: ⬆️ Update ggerganov/llama.cpp to db4cfd5dbc31c90f0d5c413a2e182d068b8ee308 by @localai-bot in #4169
  • chore(deps): Bump sentence-transformers from 3.3.0 to 3.3.1 in /backend/python/sentencetransformers by @dependabot in #4178
  • chore: ⬆️ Update ggerganov/llama.cpp to d3481e631661b5e9517f78908cdd58cee63c4903 by @localai-bot in #4196
  • chore: ⬆️ Update ggerganov/whisper.cpp to d24f981fb2fbf73ec7d72888c3129d1ed3f91916 by @localai-bot in #4195
  • chore(deps): Bump dcarbone/install-yq-action from 1.2.0 to 1.3.0 by @dependabot in #4182
  • chore(deps): Bump appleboy/ssh-action from 1.1.0 to 1.2.0 by @dependabot in #4183
  • chore: ⬆️ Update ggerganov/whisper.cpp to 6266a9f9e56a5b925e9892acf650f3eb1245814d by @localai-bot in #4202
  • chore: ⬆️ Update ggerganov/llama.cpp to 9fe0fb062630728e3c21b5839e3bce87bff2440a by @localai-bot in #4203
  • chore: ⬆️ Update ggerganov/llama.cpp to 9abe9eeae98b11fa93b82632b264126a010225ff by @localai-bot in #4212
  • chore: ⬆️ Update ggerganov/llama.cpp to a5e47592b6171ae21f3eaa1aba6fb2b707875063 by @localai-bot in #4221
  • chore: ⬆️ Update ggerganov/llama.cpp to 6dfcfef0787e9902df29f510b63621f60a09a50b by @localai-bot in #4227
  • chore: ⬆️ Update ggerganov/llama.cpp to 55ed008b2de01592659b9eba068ea01bb2f72160 by @localai-bot in #4232
  • chore: ⬆️ Update ggerganov/llama.cpp to cce5a9007572c6e9fa522296b77571d2e5071357 by @localai-bot in #4238
  • chore(deps): Bump whisper-timestamped from 1.14.2 to 1.15.8 in /backend/python/openvoice by @dependabot in #4248
  • chore(deps): Bump faster-whisper from 0.9.0 to 1.1.0 in /backend/python/openvoice by @dependabot in #4249
  • chore(deps): Bump dcarbone/install-yq-action from 1.3.0 to 1.3.1 by @dependabot in #4253
  • chore(deps): bump llama.cpp to 47f931c8f9a26c072d71224bc8013cc66ea9e445 by @mudler in #4263
  • chore: ⬆️ Update ggerganov/llama.cpp to 30ec39832165627dd6ed98938df63adfc6e6a21a by @localai-bot in #4273
  • chore: ⬆️ Update ggerganov/llama.cpp to 3ad5451f3b75809e3033e4e577b9f60bcaf6676a by @localai-bot in #4280
  • chore: ⬆️ Update ggerganov/llama.cpp to dc22344088a7ee81a1e4f096459b03a72f24ccdc by @localai-bot in #4288
  • chore: ⬆️ Update ggerganov/llama.cpp to 3a8e9af402f7893423bdab444aa16c5d9a2d429a by @localai-bot in #4290
  • chore: ⬆️ Update ggerganov/llama.cpp to 0c39f44d70d058940fe2afe50cfc789e3e44d756 by @localai-bot in #4295
  • chore: ⬆️ Update ggerganov/llama.cpp to 5e1ed95583ca552a98d8528b73e1ff81249c2bf9 by @localai-bot in #4299
  • chore(deps): bump grpcio to 1.68.1 by @mudler in #4301
  • chore(deps): Bump docs/themes/hugo-theme-relearn from 28fce6b to be85052 by @dependabot in #4305
  • chore: ⬆️ Update ggerganov/llama.cpp to 8648c521010620c2daccfa1d26015c668ba2c717 by @localai-bot in #4307
  • chore: ⬆️ Update ggerganov/llama.cpp to cc98896db858df7aa40d0e16a505883ef196a482 by @localai-bot in #4312

Other Changes

  • chore: update jobresult_test.go by @eltociear in #4124
  • chore(linguist): add *.hpp files to linguist-vendored by @mudler in #4154
  • chore(api): return values from schema by @mudler in #4153
  • feat(swagger): update swagger by @localai-bot in #4155
  • chore(Makefile): default to non-native builds for llama.cpp by @mudler in #4173
  • chore(refactor): imply modelpath by @mudler in #4208
  • chore(go.mod): tidy by @mudler in #4209
  • feat(swagger): update swagger by @localai-bot in #4211
  • integrations: add Nextcloud by @meonkeys in #4233
  • Revert "chore(deps): Bump whisper-timestamped from 1.14.2 to 1.15.8 in /backend/python/openvoice" by @mudler in #4267
  • Revert "chore(deps): Bump faster-whisper from 0.9.0 to 1.1.0 in /back… by @mudler in #4268
  • chore(scripts): handle summarization errors by @mudler in #4271

New Contributors

Full Changelog: v2.23.0...v2.24.0
