Ahoj! This new release of LocalAI comes with tons of updates and enhancements behind the scenes!
🌟 Highlights TL;DR
- Automatic identification of GGUF models
- New WebUI page to talk with an LLM!
- https://models.localai.io is live! 🚀
- Better arm64 and Apple silicon support
- More models added to the gallery!
- New quickstart installer script
- Enhancements to mixed grammar support
- Major improvements to transformers
- Linux single binary now supports ROCm, NVIDIA, and Intel
🤖 Automatic model identification for llama.cpp-based models
Just drop your GGUF files into the models folder and let LocalAI handle the configuration. YAML files are now reserved for those who love to tinker with advanced setups.
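A minimal sketch of the workflow (the URL and file name are illustrative, and `models/` is assumed to be your configured models path):

```bash
# Drop any GGUF file into the models folder (URL and file name are illustrative)
wget -P models/ https://huggingface.co/examples/my-model-Q4_K_M.gguf
# Start LocalAI: sensible defaults are guessed from the GGUF metadata, no YAML required
local-ai
```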
🔊 Talk to your LLM!
Introduced a new page that allows direct interaction with the LLM using audio transcription and TTS capabilities. This feature is a lot of fun - you can now talk with any LLM in just a couple of clicks.
🍏 Apple single-binary
Experience enhanced support for the Apple ecosystem with a comprehensive single binary that packs all the necessary libraries, ensuring LocalAI runs smoothly on macOS and ARM64 architectures.
ARM64
Expanded our support for ARM64 with new Docker images and single binary options, ensuring better compatibility and performance on ARM-based systems.
Note: currently we support only ARM64 core images, for instance: `localai/localai:master-ffmpeg-core`, `localai/localai:latest-ffmpeg-core`, `localai/localai:v2.17.0-ffmpeg-core`.
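For example, a minimal sketch of running a core image with Docker (the in-container models path is an assumption, adjust it to your setup):

```bash
# Run the ARM64 core image and expose the API on port 8080
docker run -ti -p 8080:8080 -v $PWD/models:/build/models \
  localai/localai:v2.17.0-ffmpeg-core
```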
🐞 Bug Fixes and small enhancements
We’ve ironed out several issues, including image endpoint response types and other minor problems, boosting LocalAI's stability and reliability. It is now also possible to enable CSRF protection when starting LocalAI, thanks to @dave-gray101.
🌐 Models and Galleries
Enhanced the model gallery with new additions like Mirai Nova and Mahou, along with several updates to existing models, ensuring better performance and accuracy.
You can now also browse new models at https://models.localai.io, without running LocalAI!
Installation and Setup
A new install.sh script is now available for quick and hassle-free installations, streamlining the setup process for new users.
curl https://localai.io/install.sh | sh
Installation can be configured with environment variables, for example:
curl https://localai.io/install.sh | VAR=value sh
List of the Environment Variables:
- DOCKER_INSTALL: Set to "true" to enable the installation of Docker images.
- USE_AIO: Set to "true" to use the all-in-one LocalAI Docker image.
- API_KEY: Specify an API key for accessing LocalAI, if required.
- CORE_IMAGES: Set to "true" to download core LocalAI images.
- PORT: Specifies the port on which LocalAI will run (default is 8080).
- THREADS: Number of processor threads the application should use. Defaults to the number of logical cores minus one.
- VERSION: Specifies the version of LocalAI to install. Defaults to the latest available version.
- MODELS_PATH: Directory path where LocalAI models are stored (default is /usr/share/local-ai/models).
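For example, combining a few of the variables above (the values are illustrative):

```bash
# Install LocalAI v2.17.0, listening on port 9090, with a custom models directory
curl https://localai.io/install.sh | VERSION=2.17.0 PORT=9090 MODELS_PATH=/opt/local-ai/models sh
```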
We are looking into improving the installer, and as this is a first iteration any feedback is welcome! Open up an issue if something doesn't work for you!
Enhancements to mixed grammar support
Mixed grammar support continues to receive improvements behind the scenes: function calls can now run in parallel with mixed or no grammars, and free-string matching is smarter, allowing strings to be expected after the JSON payload.
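As a sketch of what this machinery powers, here is an OpenAI-style tool-call request against LocalAI's OpenAI-compatible API (the model name and function definition are illustrative):

```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "my-model",
  "messages": [{"role": "user", "content": "What is the weather like in Rome?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the current weather for a city",
      "parameters": {
        "type": "object",
        "properties": { "city": { "type": "string" } },
        "required": ["city"]
      }
    }
  }]
}'
```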
🐍 Transformers backend enhancements
- Temperature = 0 is now correctly handled as greedy search
- Custom words are handled as stop words
- Implemented KV cache
- Phi 3 no longer requires the `trust_remote_code: true` flag
Shout-out to @fakezeta for these enhancements!
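A quick sketch exercising two of these enhancements via the OpenAI-compatible API (the model name and stop word are illustrative):

```bash
# temperature 0 now maps to greedy search; custom stop words are honored
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "phi-3-mini",
  "temperature": 0,
  "stop": ["<|end|>"],
  "messages": [{"role": "user", "content": "Say hello"}]
}'
```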
Install models with the CLI
Now the CLI can install models directly from the gallery. For instance:
local-ai run <model_name_in_gallery>
This command ensures the model is installed in the model folder at startup.
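For example (the model name is illustrative; browse https://models.localai.io for the full list):

```bash
# Install a model from the gallery without starting the server
local-ai models install gemma-2b
# Or install it and start the server in one step
local-ai run gemma-2b
```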
🐧 Linux single binary now supports ROCm, NVIDIA, and Intel
Single binaries for Linux now include Intel, AMD GPU (ROCm), and NVIDIA support. Note that you still need to install the required dependencies on the system to leverage these features. In upcoming releases this requirement will be handled by the installer script.
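A sketch of getting started with the single binary (the asset name is illustrative, check the v2.17.0 release assets for the exact file name):

```bash
curl -Lo local-ai https://github.com/mudler/LocalAI/releases/download/v2.17.0/local-ai-Linux-x86_64
chmod +x local-ai
./local-ai   # GPU acceleration is picked up if the ROCm/CUDA/Intel system libraries are present
```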
📣 Let's Make Some Noise!
A gigantic THANK YOU to everyone who’s contributed—your feedback, bug squashing, and feature suggestions are what make LocalAI shine. To all our heroes out there supporting other users and sharing their expertise, you’re the real MVPs!
Remember, LocalAI thrives on community support, not big corporate bucks. If you love what we're building, show some love! A shoutout on social (@LocalAI_OSS and @mudler_it on Twitter/X), joining our sponsors, or simply starring us on GitHub makes all the difference.
Also, if you haven't yet joined our Discord, come on over! Here's the link: https://discord.gg/uJAeKSAGDy
Thanks a ton, and... enjoy this release!
What's Changed
Bug fixes 🐛
- fix: gpu fetch device info by @sozercan in #2403
- fix(watcher): do not emit fatal errors by @mudler in #2410
- fix: install pytorch from proper index for hipblas builds by @cryptk in #2413
- fix: pin version of setuptools for intel builds to work around #2406 by @cryptk in #2414
- bugfix: CUDA acceleration not working by @fakezeta in #2475
- fix: `pkg/downloader` should respect basePath for `file://` URLs by @dave-gray101 in #2481
- fix: chat webui response parsing by @sozercan in #2515
- fix(stream): do not break channel consumption by @mudler in #2517
- fix(Makefile): enable STATIC on dist by @mudler in #2569
Exciting New Features 🎉
- feat(images): do not install python deps in the core image by @mudler in #2425
- feat(hipblas): extend default hipblas GPU_TARGETS by @mudler in #2426
- feat(build): add arm64 core containers by @mudler in #2421
- feat(functions): allow parallel calls with mixed/no grammars by @mudler in #2432
- feat(image): support `response_type` in the OpenAI API request by @prajwalnayak7 in #2347
- feat(swagger): update swagger by @localai-bot in #2436
- feat(functions): better free string matching, allow to expect strings after JSON by @mudler in #2445
- build(Makefile): add back single target to build native llama-cpp by @mudler in #2448
- feat(functions): allow `response_regex` to be a list by @mudler in #2447
- TTS API improvements by @blob42 in #2308
- feat(transformers): various enhancements to the transformers backend by @fakezeta in #2468
- feat(webui): enhance card visibility by @mudler in #2473
- feat(default): use number of physical cores as default by @mudler in #2483
- feat: fiber CSRF by @dave-gray101 in #2482
- feat(amdgpu): try to build in single binary by @mudler in #2485
- feat: `OpaqueErrors` to hide error information by @dave-gray101 in #2486
- build(intel): bundle intel variants in single-binary by @mudler in #2494
- feat(install): add install.sh for quick installs by @mudler in #2489
- feat(llama.cpp): guess model defaults from file by @mudler in #2522
- feat(ui): add page to talk with voice, transcription, and tts by @mudler in #2520
- feat(arm64): enable single-binary builds by @mudler in #2490
- feat(util): add util command to print GGUF information by @mudler in #2528
- feat(defaults): add defaults for Command-R models by @mudler in #2529
- feat(detection): detect by template in gguf file, add qwen2, phi, mistral and chatml by @mudler in #2536
- feat(gallery): show available models in website, allow `local-ai models install` to install from galleries by @mudler in #2555
- feat(gallery): uniform download from CLI by @mudler in #2559
- feat(guesser): identify gemma models by @mudler in #2561
- feat(binary): support extracted bundled libs on darwin by @mudler in #2563
- feat(darwin): embed grpc libs by @mudler in #2567
- feat(build): bundle libs for arm64 and x86 linux binaries by @mudler in #2572
- feat(libpath): refactor and expose functions for external library paths by @mudler in #2578
🧠 Models
- models(gallery): add Mirai Nova by @mudler in #2405
- models(gallery): add Mahou by @mudler in #2411
- models(gallery): add minicpm by @mudler in #2412
- models(gallery): add poppy porpoise 0.85 by @mudler in #2415
- models(gallery): add alpha centauri by @mudler in #2416
- models(gallery): add cream-phi-13b by @mudler in #2417
- models(gallery): add stheno-mahou by @mudler in #2418
- models(gallery): add iterative-dpo, fix minicpm by @mudler in #2422
- models(gallery): add una-thepitbull by @mudler in #2435
- models(gallery): add halu by @mudler in #2434
- models(gallery): add neuraldaredevil by @mudler in #2439
- models(gallery): add Codestral by @mudler in #2442
- models(gallery): add mopeymule by @mudler in #2449
- models(gallery): ⬆️ update checksum by @localai-bot in #2451
- models(gallery): add anjir by @mudler in #2454
- models(gallery): add llama3-11b by @mudler in #2455
- models(gallery): add ultron by @mudler in #2456
- models(gallery): add poppy porpoise 1.0 by @mudler in #2459
- models(gallery): add Neural SOVLish Devil by @mudler in #2460
- models(gallery): add all whisper variants by @mudler in #2462
- models(gallery): ⬆️ update checksum by @localai-bot in #2463
- models(gallery): add gemma-2b by @mudler in #2466
- models(gallery): add fimbulvetr iqmatrix version by @mudler in #2470
- models(gallery): add new poppy porpoise versions by @mudler in #2471
- models(gallery): add dolphin-2.9.2-Phi-3-Medium by @mudler in #2492
- models(gallery): add dolphin-2.9.2-phi-3-Medium-abliterated by @mudler in #2495
- models(gallery): add nyun by @mudler in #2496
- models(gallery): add phi-3-4x4b by @mudler in #2497
- models(gallery): add llama-3-instruct-8b-SimPO-ExPO by @mudler in #2498
- models(gallery): add Llama-3-Yggdrasil-2.0-8B by @mudler in #2499
- models(gallery): add l3-8b-stheno-v3.2-iq-imatrix by @mudler in #2500
- models(gallery): add llama3-8B-aifeifei-1.0-iq-imatrix by @mudler in #2509
- models(gallery): add rawr_llama3_8b-iq-imatrix by @mudler in #2510
- models(gallery): add llama3-8b-feifei-1.0-iq-imatrix by @mudler in #2511
- models(gallery): ⬆️ update checksum by @localai-bot in #2519
- models(gallery): add llama3-8B-aifeifei-1.2-iq-imatrix by @mudler in #2544
- models(gallery): add hathor-l3-8b-v.01-iq-imatrix by @mudler in #2545
- models(gallery): add l3-aethora-15b by @mudler in #2546
- models(gallery): add llama-salad-8x8b by @mudler in #2547
- models(gallery): add average_normie_v3.69_8b-iq-imatrix by @mudler in #2548
- models(gallery): add duloxetine by @mudler in #2549
- models(gallery): add badger-lambda-llama-3-8b by @mudler in #2550
- models(gallery): add firefly-gemma-7b by @mudler in #2576
- models(gallery): add dolphin-qwen by @mudler in #2580
- models(gallery): add tess-v2.5-phi-3-medium-128k-14b by @mudler in #2581
- models(gallery): add hathor_stable-v0.2-l3-8b by @mudler in #2582
- models(gallery): add samantha-qwen2 by @mudler in #2586
- models(gallery): add gemma-1.1-7b-it by @mudler in #2588
📖 Documentation and examples
- Update quickstart.md by @mudler in #2404
- docs: fix p2p commands by @mudler in #2472
- README: update sponsors list by @mudler in #2476
- Add integrations by @reid41 in #2535
- docs(gallery): lazy-load images by @mudler in #2557
- Fix standard image latest Docker tags by @nwithan8 in #2574
👒 Dependencies
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2399
- ⬆️ Update docs version mudler/LocalAI by @localai-bot in #2398
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2408
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2409
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2419
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2427
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2428
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2433
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2437
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2438
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2444
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2443
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2452
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2453
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2465
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2467
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2477
- toil: bump grpc version by @dave-gray101 in #2480
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2487
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2493
- deps(whisper): update, add libcufft-dev by @mudler in #2501
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2507
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2508
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2518
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2524
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2531
- chore(deps): Update Dockerfile by @reneleonhardt in #2532
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2539
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2540
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2552
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2551
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2554
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2564
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2565
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2570
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2575
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2584
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2587
Other Changes
- ci: fix sd release by @sozercan in #2400
- ci(grpc-cache): also arm64 by @mudler in #2423
- ci: push test images when building PRs by @mudler in #2424
- ci: pin build-time protoc by @mudler in #2461
- feat(swagger): update swagger by @localai-bot in #2464
- ci: run release build on self-hosted runners by @mudler in #2505
- experiment: `-j4` for `build-linux:` by @dave-gray101 in #2514
- test: e2e /reranker endpoint by @dave-gray101 in #2211
- ci: pack less libs inside the binary by @mudler in #2579
New Contributors
- @prajwalnayak7 made their first contribution in #2347
- @reneleonhardt made their first contribution in #2532
- @reid41 made their first contribution in #2535
- @nwithan8 made their first contribution in #2574
Full Changelog: v2.16.0...v2.17.0