
LocalAI release v2.24.0!


🚀 Highlights

  • Backend Deprecation: rwkv.cpp and bert.cpp have been removed; their functionality is now provided by llama.cpp, simplifying installation and improving performance.
  • New Backends Added: Introducing bark.cpp for text-to-audio and stablediffusion.cpp for image generation, both powered by the ggml framework.
  • Voice Activity Detection (VAD): Added support for silero-vad to detect speech in audio streams.
  • WebUI Improvements: Now supports API key authentication for enhanced security.
  • Real-Time Token Usage: Monitor token consumption during streamed outputs.
  • Expanded P2P Settings: Greater flexibility with new configuration options like listen_maddrs, dht_announce_maddrs, and bootstrap_peers.

📤 Backends Deprecation

As part of our cleanup efforts, the rwkv.cpp and bert.cpp backends have been deprecated. Their functionalities are now integrated into llama.cpp, offering a more streamlined and efficient experience.

🆕 New Backends Introduced

  • bark.cpp Backend: Transform text into realistic audio using Bark, a transformer-based text-to-audio model (a request sketch follows this list). Install it easily with:

    local-ai models install bark-cpp-small

    Or start it directly:

    local-ai run bark-cpp-small
  • stablediffusion.cpp Backend: Create high-quality images from textual descriptions using the Stable Diffusion backend, now leveraging the ggml framework (see the example after this list).

  • Voice Activity Detection with silero-vad: Introducing support for accurate speech segment detection in audio streams (see the sketch after this list). Install via:

    local-ai models install silero-vad

    Or configure it through the WebUI.
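
Once installed, bark-cpp-small can be exercised through the TTS endpoint. Below is a minimal sketch, assuming the server listens on the default port 8080 and accepts the gallery model name directly; the output format may vary:

    curl http://localhost:8080/tts \
      -H "Content-Type: application/json" \
      -d '{"model": "bark-cpp-small", "input": "Hello from LocalAI!"}' \
      --output hello.wav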
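
For image generation with the new ggml-based backend, requests go through the OpenAI-compatible images endpoint. A minimal sketch, assuming a gallery model such as flux.1-dev-ggml (added in this release) is installed and fits your hardware:

    # Install an image model from the gallery, then request an image
    local-ai models install flux.1-dev-ggml
    curl http://localhost:8080/v1/images/generations \
      -H "Content-Type: application/json" \
      -d '{"model": "flux.1-dev-ggml", "prompt": "a cozy cabin in a snowy forest", "size": "256x256"}'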
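
For silero-vad, detection runs against raw audio samples. The sketch below is an assumption about the request shape (the endpoint path and payload are not documented in these notes), so consult the API documentation for the exact format:

    # Assumed request shape: model name plus an array of PCM float samples
    curl http://localhost:8080/vad \
      -H "Content-Type: application/json" \
      -d '{"model": "silero-vad", "audio": [0.0, 0.021, 0.044, 0.012]}'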

🔒 WebUI Access with API Keys

The WebUI now supports API key authentication. If one or more API keys are configured, the WebUI automatically presents an authentication page.
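
API clients authenticate the same way, by sending the key as a Bearer token. A minimal sketch, assuming keys are configured at startup via the --api-keys flag (the flag and environment variable names are assumptions here; check the documentation):

    # Start LocalAI with an API key (flag name assumed)
    local-ai run --api-keys "my-secret-key"

    # Authenticated request against the REST API
    curl http://localhost:8080/v1/models \
      -H "Authorization: Bearer my-secret-key"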

🏆 Enhancements and Features

  • Real-Time Token Usage: Monitor token consumption dynamically during streamed outputs, helping to track performance and manage costs. A request sketch follows this list.
  • P2P Configuration: New settings for advanced peer-to-peer mode:
    • listen_maddrs: Define specific multiaddresses for your node.
    • dht_announce_maddrs: Specify addresses to announce to the DHT network.
    • bootstrap_peers: Set custom bootstrap peers for initial connectivity.
    These options offer more control, especially in constrained networks or custom P2P environments; an illustrative configuration sketch follows this list.
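
To see the streamed token usage in action, a plain streaming chat completion is enough. A minimal sketch, assuming any installed chat model; the exact chunk carrying the usage object may vary:

    # -N disables curl buffering so SSE chunks print as they arrive;
    # watch for the "usage" object with prompt/completion token counts.
    curl -N http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
            "model": "qwen2.5-coder-7b-instruct",
            "stream": true,
            "messages": [{"role": "user", "content": "Write a haiku about llamas."}]
          }'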
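
For the new P2P options, the sketch below is purely illustrative: the environment variable names are hypothetical (consult the P2P documentation for the real ones), and only the multiaddress values show the expected libp2p format.

    # Hypothetical variable names, shown only to illustrate multiaddress syntax
    export LOCALAI_P2P_LISTEN_MADDRS="/ip4/0.0.0.0/tcp/40001"
    export LOCALAI_P2P_DHT_ANNOUNCE_MADDRS="/ip4/203.0.113.10/tcp/40001"
    export LOCALAI_P2P_BOOTSTRAP_PEERS="/ip4/203.0.113.20/tcp/40001/p2p/12D3KooWExamplePeerID"
    local-ai run --p2p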

🖼️ New Models in the Gallery

We've significantly expanded our model gallery with a variety of new models to cater to diverse AI applications. Among these:

  • Calme-3 Qwen2.5 Series: Enhanced language models offering improved understanding and generation capabilities.
  • Mistral-Nemo-Prism-12b: A powerful model designed for complex language tasks.
  • Llama 3.1 and 3.2 Series: Upgraded versions of the Llama models with better performance and accuracy.
  • Qwen2.5-Coder Series: Specialized models optimized for code generation and programming language understanding.
  • Rombos-Coder Series: Advanced coder models for sophisticated code-related tasks.
  • Silero-VAD: High-quality voice activity detection model for audio processing applications.
  • Bark-Cpp-Small: Lightweight audio generation model suitable for quick and efficient audio synthesis.

Explore these models and more in our updated model gallery to find the perfect fit for your project needs.

🐞 Bug Fixes and Improvements

  • Performance Enhancements: Fixed AVX flag handling when accelerated binaries are used, and embedded the Metal file into the resulting binary on macOS for better out-of-the-box performance.
  • Dependency Updates: Upgraded various dependencies to ensure compatibility, security, and performance improvements across the board.
  • Parsing Corrections: Fixed parsing issues related to maddr and ExtraLLamaCPPArgs in P2P configurations.

📚 Documentation and Examples

  • Updated Guides: Refreshed documentation with new configuration examples, making it easier to get started and integrate the latest features.

📥 How to Upgrade

To upgrade to LocalAI v2.24.0:

  • Download the Latest Release: Get the binaries from our GitHub Releases page.
  • Update Docker Image: Pull the latest Docker image using:
    docker pull localai/localai:latest

See also the Documentation at: https://localai.io/basics/container/#standard-container-images
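
After pulling, the container can be started with something like the sketch below (8080 is LocalAI's default port; pick the image tag that matches your hardware, as described in the linked documentation):

    docker run -ti -p 8080:8080 --name local-ai localai/localai:latest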

Happy hacking!

What's Changed

Breaking Changes 🛠

Bug fixes 🐛

  • fix(hipblas): disable avx flags when accellerated bins are used by @mudler in #4167
  • chore(deps): bump sycl intel image by @mudler in #4201
  • fix(go.mod): add urfave/cli v2 by @mudler in #4206
  • chore(go.mod): add valyala/fasttemplate by @mudler in #4207
  • fix(p2p): parse maddr correctly by @mudler in #4219
  • fix(p2p): parse correctly ExtraLLamaCPPArgs by @mudler in #4220
  • fix(llama.cpp): embed metal file into result binary for darwin by @mudler in #4279

Exciting New Features 🎉

  • feat: add WebUI API token authorization by @mintyleaf in #4197
  • feat(p2p): add support for configuration of edgevpn listen_maddrs, dht_announce_maddrs and bootstrap_peers by @mintyleaf in #4200
  • feat(silero): add Silero-vad backend by @mudler in #4204
  • feat: include tokens usage for streamed output by @mintyleaf in #4282
  • feat(bark-cpp): add new bark.cpp backend by @mudler in #4287
  • feat(backend): add stablediffusion-ggml by @mudler in #4289

🧠 Models

  • models(gallery): add calme-3 qwen2.5 series by @mudler in #4107
  • models(gallery): add calme-3 qwenloi series by @mudler in #4108
  • models(gallery): add calme-3 llamaloi series by @mudler in #4109
  • models(gallery): add mn-tiramisu-12b by @mudler in #4110
  • models(gallery): add qwen2.5-coder-14b by @mudler in #4125
  • models(gallery): add qwen2.5-coder-3b-instruct by @mudler in #4126
  • models(gallery): add qwen2.5-coder-32b-instruct by @mudler in #4127
  • models(gallery): add qwen2.5-coder-14b-instruct by @mudler in #4128
  • models(gallery): add qwen2.5-coder-1.5b-instruct by @mudler in #4129
  • models(gallery): add qwen2.5-coder-7b-instruct by @mudler in #4130
  • models(gallery): add qwen2.5-coder-7b-3x-instruct-ties-v1.2-i1 by @mudler in #4131
  • models(gallery): add qwen2.5-coder-7b-instruct-abliterated-i1 by @mudler in #4132
  • models(gallery): add rombos-coder-v2.5-qwen-7b by @mudler in #4133
  • models(gallery): add rombos-coder-v2.5-qwen-32b by @mudler in #4134
  • models(gallery): add rombos-coder-v2.5-qwen-14b by @mudler in #4135
  • models(gallery): add eva-qwen2.5-72b-v0.1-i1 by @mudler in #4136
  • models(gallery): add mistral-nemo-prism-12b by @mudler in #4141
  • models(gallery): add tess-3-llama-3.1-70b by @mudler in #4143
  • models(gallery): add celestial-harmony-14b-v1.0-experimental-1016-i1 by @mudler in #4145
  • models(gallery): add llama3.1-8b-enigma by @mudler in #4146
  • chore(model): add llama3.1-8b-cobalt to the gallery by @mudler in #4147
  • chore(model): add qwen2.5-32b-arliai-rpmax-v1.3 to the gallery by @mudler in #4148
  • chore(model): add llama3.2-3b-enigma to the gallery by @mudler in #4149
  • chore(model): add llama-3.1-8b-arliai-rpmax-v1.3 to the gallery by @mudler in #4150
  • chore(model): add magnum-12b-v2.5-kto-i1 to the gallery by @mudler in #4151
  • chore(model): add l3.1-8b-slush-i1 to the gallery by @mudler in #4152
  • models(gallery): add q2.5-ms-mistoria-72b-i1 by @mudler in #4158
  • chore(model): add l3.1-ms-astoria-70b-v2 to the gallery by @mudler in #4159
  • chore(model): add magnum-v2-4b-i1 to the gallery by @mudler in #4160
  • chore(model): add athene-v2-agent to the gallery by @mudler in #4161
  • chore(model): add athene-v2-chat to the gallery by @mudler in #4162
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #4165
  • chore(model): add qwen2.5-7b-nerd-uncensored-v1.7 to the gallery by @mudler in #4171
  • chore(model): add llama3.2-3b-shiningvaliant2-i1 to the gallery by @mudler in #4174
  • chore(model): add l3.1-nemotron-sunfall-v0.7.0-i1 to the gallery by @mudler in #4175
  • chore(model): add evathene-v1.0 to the gallery by @mudler in #4176
  • chore(model): add miniclaus-qw1.5b-unamgs to the gallery by @mudler in #4177
  • chore(model): add silero-vad to the gallery by @mudler in #4210
  • models(gallery): add llama-mesh by @mudler in #4222
  • chore(model): add llama-doctor-3.2-3b-instruct to the gallery by @mudler in #4223
  • chore(model): add copus-2x8b-i1 to the gallery by @mudler in #4225
  • chore(model): add llama-3.1-8b-instruct-ortho-v3 to the gallery by @mudler in #4226
  • chore(model): add llama-3.1-tulu-3-8b-dpo to the gallery by @mudler in #4228
  • chore(model): add marco-o1 to the gallery by @mudler in #4229
  • chore(model): add onellm-doey-v1-llama-3.2-3b to the gallery by @mudler in #4230
  • chore(model): add llama-sentient-3.2-3b-instruct to the gallery by @mudler in #4235
  • chore(model): add qwen2.5-3b-smart-i1 to the gallery by @mudler in #4236
  • chore(model): add l3.1-aspire-heart-matrix-8b to the gallery by @mudler in #4237
  • chore(model): add dark-chivalry_v1.0-i1 to the gallery by @mudler in #4242
  • chore(model): add qwen2.5-coder-32b-instruct-uncensored-i1 to the gallery by @mudler in #4241
  • chore(model): add tulu-3.1-8b-supernova-i1 to the gallery by @mudler in #4243
  • chore(model): add steyrcannon-0.2-qwen2.5-72b to the gallery by @mudler in #4244
  • chore(model-gallery): ⬆️ update checksum by @localai-bot in #4261
  • chore(model): add llama-3.1_openscholar-8b to the gallery by @mudler in #4262
  • chore(model): add rwkv-6-world-7b to the gallery by @mudler in #4270
  • chore(model): add q2.5-ms-mistoria-72b-v2 model config by @mudler in #4275
  • chore(model): add llama-3.1-tulu-3-70b-dpo model config by @mudler in #4276
  • chore(model): add llama-3.1-tulu-3-8b-sft to the gallery by @mudler in #4277
  • chore(model): add eva-qwen2.5-72b-v0.2 to the gallery by @mudler in #4278
  • fix(rwkv model): add stoptoken by @mudler in #4283
  • chore(model gallery): add qwq-32b-preview by @mudler in #4284
  • chore(model gallery): add llama-smoltalk-3.2-1b-instruct by @mudler in #4285
  • chore(model gallery): add q2.5-32b-slush-i1 by @mudler in #4292
  • chore(model gallery): add freyja-v4.95-maldv-7b-non-fiction-i1 by @mudler in #4293
  • chore(model gallery): add qwestion-24b by @mudler in #4294
  • chore(model gallery): add volare-i1 by @mudler in #4296
  • chore(model gallery): add skywork-o1-open-llama-3.1-8b by @mudler in #4297
  • chore(model gallery): add teleut-7b by @mudler in #4298
  • chore(model gallery): add sparse-llama-3.1-8b-2of4 by @mudler in #4309
  • chore(model gallery): add qwen2.5-7b-homercreative-mix by @mudler in #4310
  • chore(model gallery): add bggpt-gemma-2-2.6b-it-v1.0 by @mudler in #4311
  • chore(model gallery): add cybercore-qwen-2.1-7b by @mudler in #4314
  • chore(model gallery): add chatty-harry_v3.0 by @mudler in #4315
  • chore(model gallery): add homercreativeanvita-mix-qw7b by @mudler in #4316
  • chore(model gallery): add flux.1-dev-ggml by @mudler in #4317
  • chore(model gallery): add bark-cpp-small by @mudler in #4318

📖 Documentation and examples

👒 Dependencies

  • chore: ⬆️ Update ggerganov/llama.cpp to 4b3a9212b602be3d4e2e3ca26efd796cef13c55e by @localai-bot in #4106
  • chore(deps): Bump setuptools from 69.5.1 to 75.4.0 in /backend/python/transformers by @dependabot in #4117
  • Revert "chore(deps): Bump setuptools from 69.5.1 to 75.4.0 in /backend/python/transformers" by @mudler in #4123
  • chore(deps): Bump dcarbone/install-yq-action from 1.1.1 to 1.2.0 by @dependabot in #4114
  • chore(deps): Bump sentence-transformers from 3.2.0 to 3.3.0 in /backend/python/sentencetransformers by @dependabot in #4120
  • chore: ⬆️ Update ggerganov/llama.cpp to 54ef9cfc726a799e6f454ac22c4815d037716eda by @localai-bot in #4122
  • chore: ⬆️ Update ggerganov/whisper.cpp to f19463ece2d43fd0b605dc513d8800eeb4e2315e by @localai-bot in #4139
  • chore: ⬆️ Update ggerganov/llama.cpp to fb4a0ec0833c71cff5a1a367ba375447ce6106eb by @localai-bot in #4140
  • chore(deps): bump llama-cpp to ae8de6d50a09d49545e0afab2e50cc4acfb280e2 by @mudler in #4157
  • chore: ⬆️ Update ggerganov/llama.cpp to 883d206fbd2c5b2b9b589a9328503b9005e146c9 by @localai-bot in #4164
  • chore(deps): bump grpcio to 1.68.0 by @mudler in #4166
  • chore: ⬆️ Update ggerganov/whisper.cpp to 01d3bd7d5ccd1956a7ddf1b57ee92d69f35aad93 by @localai-bot in #4163
  • chore: ⬆️ Update ggerganov/llama.cpp to db4cfd5dbc31c90f0d5c413a2e182d068b8ee308 by @localai-bot in #4169
  • chore(deps): Bump sentence-transformers from 3.3.0 to 3.3.1 in /backend/python/sentencetransformers by @dependabot in #4178
  • chore: ⬆️ Update ggerganov/llama.cpp to d3481e631661b5e9517f78908cdd58cee63c4903 by @localai-bot in #4196
  • chore: ⬆️ Update ggerganov/whisper.cpp to d24f981fb2fbf73ec7d72888c3129d1ed3f91916 by @localai-bot in #4195
  • chore(deps): Bump dcarbone/install-yq-action from 1.2.0 to 1.3.0 by @dependabot in #4182
  • chore(deps): Bump appleboy/ssh-action from 1.1.0 to 1.2.0 by @dependabot in #4183
  • chore: ⬆️ Update ggerganov/whisper.cpp to 6266a9f9e56a5b925e9892acf650f3eb1245814d by @localai-bot in #4202
  • chore: ⬆️ Update ggerganov/llama.cpp to 9fe0fb062630728e3c21b5839e3bce87bff2440a by @localai-bot in #4203
  • chore: ⬆️ Update ggerganov/llama.cpp to 9abe9eeae98b11fa93b82632b264126a010225ff by @localai-bot in #4212
  • chore: ⬆️ Update ggerganov/llama.cpp to a5e47592b6171ae21f3eaa1aba6fb2b707875063 by @localai-bot in #4221
  • chore: ⬆️ Update ggerganov/llama.cpp to 6dfcfef0787e9902df29f510b63621f60a09a50b by @localai-bot in #4227
  • chore: ⬆️ Update ggerganov/llama.cpp to 55ed008b2de01592659b9eba068ea01bb2f72160 by @localai-bot in #4232
  • chore: ⬆️ Update ggerganov/llama.cpp to cce5a9007572c6e9fa522296b77571d2e5071357 by @localai-bot in #4238
  • chore(deps): Bump whisper-timestamped from 1.14.2 to 1.15.8 in /backend/python/openvoice by @dependabot in #4248
  • chore(deps): Bump faster-whisper from 0.9.0 to 1.1.0 in /backend/python/openvoice by @dependabot in #4249
  • chore(deps): Bump dcarbone/install-yq-action from 1.3.0 to 1.3.1 by @dependabot in #4253
  • chore(deps): bump llama.cpp to 47f931c8f9a26c072d71224bc8013cc66ea9e445 by @mudler in #4263
  • chore: ⬆️ Update ggerganov/llama.cpp to 30ec39832165627dd6ed98938df63adfc6e6a21a by @localai-bot in #4273
  • chore: ⬆️ Update ggerganov/llama.cpp to 3ad5451f3b75809e3033e4e577b9f60bcaf6676a by @localai-bot in #4280
  • chore: ⬆️ Update ggerganov/llama.cpp to dc22344088a7ee81a1e4f096459b03a72f24ccdc by @localai-bot in #4288
  • chore: ⬆️ Update ggerganov/llama.cpp to 3a8e9af402f7893423bdab444aa16c5d9a2d429a by @localai-bot in #4290
  • chore: ⬆️ Update ggerganov/llama.cpp to 0c39f44d70d058940fe2afe50cfc789e3e44d756 by @localai-bot in #4295
  • chore: ⬆️ Update ggerganov/llama.cpp to 5e1ed95583ca552a98d8528b73e1ff81249c2bf9 by @localai-bot in #4299
  • chore(deps): bump grpcio to 1.68.1 by @mudler in #4301
  • chore(deps): Bump docs/themes/hugo-theme-relearn from 28fce6b to be85052 by @dependabot in #4305
  • chore: ⬆️ Update ggerganov/llama.cpp to 8648c521010620c2daccfa1d26015c668ba2c717 by @localai-bot in #4307
  • chore: ⬆️ Update ggerganov/llama.cpp to cc98896db858df7aa40d0e16a505883ef196a482 by @localai-bot in #4312

Other Changes

  • chore: update jobresult_test.go by @eltociear in #4124
  • chore(linguist): add *.hpp files to linguist-vendored by @mudler in #4154
  • chore(api): return values from schema by @mudler in #4153
  • feat(swagger): update swagger by @localai-bot in #4155
  • chore(Makefile): default to non-native builds for llama.cpp by @mudler in #4173
  • chore(refactor): imply modelpath by @mudler in #4208
  • chore(go.mod): tidy by @mudler in #4209
  • feat(swagger): update swagger by @localai-bot in #4211
  • integrations: add Nextcloud by @meonkeys in #4233
  • Revert "chore(deps): Bump whisper-timestamped from 1.14.2 to 1.15.8 in /backend/python/openvoice" by @mudler in #4267
  • Revert "chore(deps): Bump faster-whisper from 0.9.0 to 1.1.0 in /back… by @mudler in #4268
  • chore(scripts): handle summarization errors by @mudler in #4271

New Contributors

Full Changelog: v2.23.0...v2.24.0
