mudler/LocalAI v2.18.0

⭐ Highlights

Here’s a quick overview of what’s new in 2.18.0:

  • 🐳 Support for models from OCI registries (including Ollama)
  • 🌋 Support for llama.cpp with Vulkan (container images only, for now)
  • 🗣️ The transcription endpoint can now also translate, via the new translate option
  • ⚙️ New repeat_last_n and properties_order model configuration options
  • ⬆️ CUDA 12.5 Upgrade: we are now tracking the latest CUDA version (12.5).
  • 💎 Gemma 2 model support!

🐋 Support for OCI Images and Ollama Models

You can now specify models using oci:// and ollama:// prefixes in your YAML config files. Here’s an example for Ollama models:

parameters:
  model: ollama://...

You can also start an Ollama model directly with:

local-ai run ollama://gemma:2b

Or download only the model by using:

local-ai models install ollama://gemma:2b

For standard OCI images, use the oci:// prefix. You can build a compatible container image with docker, for example.

Your Dockerfile should look like this:

FROM scratch
COPY ./my_gguf_file.gguf /
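
Once built, push the image to any container registry LocalAI can reach. A minimal sketch, using a hypothetical registry and image name:

# Hypothetical registry and image name - replace with your own
docker build -t registry.example.com/myuser/my-model:latest .
docker push registry.example.com/myuser/my-model:latest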

You can also use this approach to store other model types (for instance, safetensors files for Stable Diffusion) as well as YAML config files!
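
The pushed image can then be referenced from a model YAML file with the oci:// prefix, mirroring the ollama:// example above. A minimal sketch, with a hypothetical image reference:

parameters:
  model: oci://registry.example.com/myuser/my-model:latest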

🌋 Vulkan Support for Llama.cpp

We’ve introduced Vulkan support for llama.cpp! Check out our new image tags latest-vulkan-ffmpeg-core and v2.18.0-vulkan-ffmpeg-core.
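
To try it, you can run one of the Vulkan images directly. A minimal sketch, assuming the image is published under the localai/localai repository like the other tags and that your GPU is exposed to the container via /dev/dri:

docker run -ti -p 8080:8080 --device /dev/dri localai/localai:latest-vulkan-ffmpeg-core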

🗣️ Transcription and Translation

Our transcription endpoint now supports translation! Simply add translate: true to your transcription requests to have the transcript translated to English.
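
For example, with a whisper model configured, a request could look like the sketch below. It assumes the OpenAI-compatible /v1/audio/transcriptions endpoint on the default port; whisper-1 and audio.wav are placeholders:

curl http://localhost:8080/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F file="@audio.wav" \
  -F model="whisper-1" \
  -F translate="true"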

⚙️ Enhanced Model Configuration

We’ve added new configuration options repeat_last_n and properties_order to give you more control. Here’s how you can set them up in your model YAML file:

# Force JSON to return properties in the specified order
function:
   grammar:
      properties_order: "name,arguments"

And for setting repeat_last_n (specific to the llama.cpp backend):

parameters:
   repeat_last_n: 64
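
Putting the two together, a complete model file could look like the following sketch (my-model and my_gguf_file.gguf are placeholder names):

name: my-model
parameters:
   model: my_gguf_file.gguf
   # Consider the last 64 tokens for the repetition penalty (llama.cpp)
   repeat_last_n: 64
function:
   grammar:
      # Force JSON to return properties in the specified order
      properties_order: "name,arguments"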

💎 Gemma 2!

Google has just released the Gemma 2 models (see Google's announcement blog post). You can already install and run Gemma 2 models in LocalAI with:

local-ai run gemma-2-27b-it
local-ai run gemma-2-9b-it
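
Once the model is up, you can talk to it through the OpenAI-compatible chat endpoint. A minimal sketch, assuming LocalAI is listening on the default port 8080:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma-2-9b-it", "messages": [{"role": "user", "content": "Hello!"}]}'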

What's Changed

Bug fixes 🐛

  • fix(install.sh): correctly handle systemd service installation by @mudler in #2627
  • fix(worker): use dynaload for single binaries by @mudler in #2620
  • fix(install.sh): fix version typo by @mudler in #2645
  • fix(install.sh): move ARCH detection so it works also for mac by @mudler in #2646
  • fix(cli): remove duplicate alias by @mudler in #2654

Exciting New Features 🎉

  • feat: Upgrade to CUDA 12.5 by @reneleonhardt in #2601
  • feat(oci): support OCI images and Ollama models by @mudler in #2628
  • feat(whisper): add translate option by @mudler in #2649
  • feat(vulkan): add vulkan support to the llama.cpp backend by @mudler in #2648
  • feat(ui): allow to select between all the available models in the chat by @mudler in #2657
  • feat(build): only build llama.cpp relevant targets by @mudler in #2659
  • feat(options): add repeat_last_n by @mudler in #2660
  • feat(grammar): expose properties_order by @mudler in #2662

🧠 Models

  • models(gallery): add l3-umbral-mind-rp-v1.0-8b-iq-imatrix by @mudler in #2608
  • models(gallery): ⬆️ update checksum by @localai-bot in #2607
  • models(gallery): add llama-3-sec-chat by @mudler in #2611
  • models(gallery): add llama-3-cursedstock-v1.8-8b-iq-imatrix by @mudler in #2612
  • models(gallery): add llama3-8b-darkidol-1.1-iq-imatrix by @mudler in #2613
  • models(gallery): add magnum-72b-v1 by @mudler in #2614
  • models(gallery): add qwen2-1.5b-ita by @mudler in #2615
  • models(gallery): add hermes-2-theta-llama-3-70b by @mudler in #2626
  • models(gallery): ⬆️ update checksum by @localai-bot in #2630
  • models(gallery): add dark-idol-1.2 by @mudler in #2663
  • models(gallery): add einstein v7 qwen2 by @mudler in #2664
  • models(gallery): add arcee-spark by @mudler in #2665
  • models(gallery): add gemma2-9b-it and gemma2-27b-it by @mudler in #2670

Full Changelog: v2.17.1...v2.18.0
