mudler/LocalAI v2.18.0

⭐ Highlights

Here’s a quick overview of what’s new in 2.18.0:

  • 🐳 Support for models from OCI registries (including Ollama)
  • 🌋 Support for llama.cpp with Vulkan (container images only, for now)
  • 🗣️ The transcription endpoint can now also translate, via the new translate option
  • ⚙️ New repeat_last_n and properties_order model configuration options
  • ⬆️ CUDA 12.5 Upgrade: we are now tracking the latest CUDA version (12.5).
  • 💎 Gemma 2 model support!

🐋 Support for OCI Images and Ollama Models

You can now specify models using oci:// and ollama:// prefixes in your YAML config files. Here’s an example for Ollama models:

parameters:
  model: ollama://...

You can also start an Ollama model directly with:

local-ai run ollama://gemma:2b

Or download only the model by using:

local-ai models install ollama://gemma:2b

For standard OCI images, use the oci:// prefix. You can build a compatible container image with docker, for example.

Your Dockerfile should look like this:

FROM scratch
COPY ./my_gguf_file.gguf /
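
Once built, push the image to any container registry LocalAI can reach. A minimal sketch, using a hypothetical registry and image name:

# Hypothetical registry and image name - replace with your own
docker build -t registry.example.com/myuser/my-model:latest .
docker push registry.example.com/myuser/my-model:latest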

You can also use this approach to store other model types (for instance, safetensors files for Stable Diffusion) as well as YAML config files!
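
The pushed image can then be referenced from a model YAML file with the oci:// prefix, mirroring the ollama:// example above. A minimal sketch, with a hypothetical image reference:

parameters:
  model: oci://registry.example.com/myuser/my-model:latest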

🌋 Vulkan Support for Llama.cpp

We’ve introduced Vulkan support for llama.cpp! Check out our new image tags latest-vulkan-ffmpeg-core and v2.18.0-vulkan-ffmpeg-core.
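
To try it, you can run one of the Vulkan images directly. A minimal sketch, assuming the image is published under the localai/localai repository like the other tags and that your GPU is exposed to the container via /dev/dri:

docker run -ti -p 8080:8080 --device /dev/dri localai/localai:latest-vulkan-ffmpeg-core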

🗣️ Transcription and Translation

Our transcription endpoint now supports translation! Simply add translate: true to your transcription requests to have the transcript translated to English.
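
For example, with a whisper model configured, a request could look like the sketch below. It assumes the OpenAI-compatible /v1/audio/transcriptions endpoint on the default port; whisper-1 and audio.wav are placeholders:

curl http://localhost:8080/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F file="@audio.wav" \
  -F model="whisper-1" \
  -F translate="true"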

⚙️ Enhanced Model Configuration

We’ve added new configuration options repeat_last_n and properties_order to give you more control. Here’s how you can set them up in your model YAML file:

# Force JSON to return properties in the specified order
function:
   grammar:
      properties_order: "name,arguments"

And for setting repeat_last_n (specific to the llama.cpp backend):

parameters:
   repeat_last_n: 64
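
Putting the two together, a complete model file could look like the following sketch (my-model and my_gguf_file.gguf are placeholder names):

name: my-model
parameters:
   model: my_gguf_file.gguf
   # Consider the last 64 tokens for the repetition penalty (llama.cpp)
   repeat_last_n: 64
function:
   grammar:
      # Force JSON to return properties in the specified order
      properties_order: "name,arguments"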

💎 Gemma 2!

Google has just released the Gemma 2 models (see Google's announcement blog post). You can already install and run Gemma 2 models in LocalAI with:

local-ai run gemma-2-27b-it
local-ai run gemma-2-9b-it
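
Once the model is up, you can talk to it through the OpenAI-compatible chat endpoint. A minimal sketch, assuming LocalAI is listening on the default port 8080:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma-2-9b-it", "messages": [{"role": "user", "content": "Hello!"}]}'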

What's Changed

Bug fixes 🐛

  • fix(install.sh): correctly handle systemd service installation by @mudler in #2627
  • fix(worker): use dynaload for single binaries by @mudler in #2620
  • fix(install.sh): fix version typo by @mudler in #2645
  • fix(install.sh): move ARCH detection so it works also for mac by @mudler in #2646
  • fix(cli): remove duplicate alias by @mudler in #2654

Exciting New Features 🎉

  • feat: Upgrade to CUDA 12.5 by @reneleonhardt in #2601
  • feat(oci): support OCI images and Ollama models by @mudler in #2628
  • feat(whisper): add translate option by @mudler in #2649
  • feat(vulkan): add vulkan support to the llama.cpp backend by @mudler in #2648
  • feat(ui): allow to select between all the available models in the chat by @mudler in #2657
  • feat(build): only build llama.cpp relevant targets by @mudler in #2659
  • feat(options): add repeat_last_n by @mudler in #2660
  • feat(grammar): expose properties_order by @mudler in #2662

🧠 Models

  • models(gallery): add l3-umbral-mind-rp-v1.0-8b-iq-imatrix by @mudler in #2608
  • models(gallery): ⬆️ update checksum by @localai-bot in #2607
  • models(gallery): add llama-3-sec-chat by @mudler in #2611
  • models(gallery): add llama-3-cursedstock-v1.8-8b-iq-imatrix by @mudler in #2612
  • models(gallery): add llama3-8b-darkidol-1.1-iq-imatrix by @mudler in #2613
  • models(gallery): add magnum-72b-v1 by @mudler in #2614
  • models(gallery): add qwen2-1.5b-ita by @mudler in #2615
  • models(gallery): add hermes-2-theta-llama-3-70b by @mudler in #2626
  • models(gallery): ⬆️ update checksum by @localai-bot in #2630
  • models(gallery): add dark-idol-1.2 by @mudler in #2663
  • models(gallery): add einstein v7 qwen2 by @mudler in #2664
  • models(gallery): add arcee-spark by @mudler in #2665
  • models(gallery): add gemma2-9b-it and gemma2-27b-it by @mudler in #2670

Full Changelog: v2.17.1...v2.18.0
