Ahoj! This new release of LocalAI comes with tons of updates and enhancements behind the scenes!
🌟 Highlights TL;DR
- Automatic identification of GGUF models
- New WebUI page to talk with an LLM!
- https://models.localai.io is live! 🚀
- Better arm64 and Apple silicon support
- More models added to the gallery!
- New quickstart installer script
- Enhancements to mixed grammar support
- Major improvements to transformers
- Linux single binary now supports ROCm, NVIDIA, and Intel
🤖 Automatic model identification for llama.cpp-based models
Just drop your GGUF files into the models folder and let LocalAI handle the configuration. YAML files are now reserved for those who love to tinker with advanced setups.
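A minimal sketch of the workflow (the URL and file name are illustrative, and `models/` is assumed to be your configured models path):

```bash
# Drop any GGUF file into the models folder (URL and file name are illustrative)
wget -P models/ https://huggingface.co/examples/my-model-Q4_K_M.gguf
# Start LocalAI: sensible defaults are guessed from the GGUF metadata, no YAML required
local-ai
```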
🔊 Talk to your LLM!
Introduced a new page that allows direct interaction with the LLM using audio transcription and TTS capabilities. This feature is a lot of fun - you can now talk with any LLM in just a couple of clicks.
🍏 Apple single-binary
Experience enhanced support for the Apple ecosystem with a comprehensive single binary that packs all the necessary libraries, ensuring LocalAI runs smoothly on macOS and ARM64 architectures.
ARM64
Expanded our support for ARM64 with new Docker images and single binary options, ensuring better compatibility and performance on ARM-based systems.
Note: currently we support only ARM64 core images, for instance: `localai/localai:master-ffmpeg-core`, `localai/localai:latest-ffmpeg-core`, `localai/localai:v2.17.0-ffmpeg-core`.
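For example, a minimal sketch of running a core image with Docker (the in-container models path is an assumption, adjust it to your setup):

```bash
# Run the ARM64 core image and expose the API on port 8080
docker run -ti -p 8080:8080 -v $PWD/models:/build/models \
  localai/localai:v2.17.0-ffmpeg-core
```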
🐞 Bug Fixes and small enhancements
We’ve ironed out several issues, including image endpoint response types and other minor problems, boosting LocalAI's stability and reliability. It is now also possible to enable CSRF protection when starting LocalAI, thanks to @dave-gray101.
🌐 Models and Galleries
Enhanced the model gallery with new additions like Mirai Nova and Mahou, along with several updates to existing models, ensuring better performance and accuracy.
You can now also browse new models at https://models.localai.io, without running LocalAI!
Installation and Setup
A new install.sh script is now available for quick and hassle-free installations, streamlining the setup process for new users.
curl https://localai.io/install.sh | sh
Installation can be configured with environment variables, for example:
curl https://localai.io/install.sh | VAR=value sh
List of the Environment Variables:
- DOCKER_INSTALL: Set to "true" to enable the installation of Docker images.
- USE_AIO: Set to "true" to use the all-in-one LocalAI Docker image.
- API_KEY: Specify an API key for accessing LocalAI, if required.
- CORE_IMAGES: Set to "true" to download core LocalAI images.
- PORT: Specifies the port on which LocalAI will run (default is 8080).
- THREADS: Number of processor threads the application should use. Defaults to the number of logical cores minus one.
- VERSION: Specifies the version of LocalAI to install. Defaults to the latest available version.
- MODELS_PATH: Directory path where LocalAI models are stored (default is /usr/share/local-ai/models).
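For example, combining a few of the variables above (the values are illustrative):

```bash
# Install LocalAI v2.17.0, listening on port 9090, with a custom models directory
curl https://localai.io/install.sh | VERSION=2.17.0 PORT=9090 MODELS_PATH=/opt/local-ai/models sh
```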
We are looking into improving the installer, and as this is a first iteration any feedback is welcome! Open up an issue if something doesn't work for you!
Enhancements to mixed grammar support
Mixed grammar support continues to receive improvements behind the scenes: function calls can now run in parallel with mixed or no grammars, and free-string matching is smarter, allowing strings to be expected after the JSON payload.
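As a sketch of what this machinery powers, here is an OpenAI-style tool-call request against LocalAI's OpenAI-compatible API (the model name and function definition are illustrative):

```bash
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "my-model",
  "messages": [{"role": "user", "content": "What is the weather like in Rome?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the current weather for a city",
      "parameters": {
        "type": "object",
        "properties": { "city": { "type": "string" } },
        "required": ["city"]
      }
    }
  }]
}'
```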
🐍 Transformers backend enhancements
- Temperature = 0 is now correctly handled as greedy search
- Custom words are handled as stop words
- Implemented KV cache
- Phi 3 no longer requires the `trust_remote_code: true` flag
Shout-out to @fakezeta for these enhancements!
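A quick sketch exercising two of these enhancements via the OpenAI-compatible API (the model name and stop word are illustrative):

```bash
# temperature 0 now maps to greedy search; custom stop words are honored
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "phi-3-mini",
  "temperature": 0,
  "stop": ["<|end|>"],
  "messages": [{"role": "user", "content": "Say hello"}]
}'
```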
Install models with the CLI
Now the CLI can install models directly from the gallery. For instance:
local-ai run <model_name_in_gallery>
This command ensures the model is installed in the model folder at startup.
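For example (the model name is illustrative; browse https://models.localai.io for the full list):

```bash
# Install a model from the gallery without starting the server
local-ai models install gemma-2b
# Or install it and start the server in one step
local-ai run gemma-2b
```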
🐧 Linux single binary now supports ROCm, NVIDIA, and Intel
Single binaries for Linux now include Intel, AMD GPU (ROCm), and NVIDIA support. Note that you still need to install the required dependencies on the system to leverage these features. In upcoming releases this requirement will be handled by the installer script.
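A sketch of getting started with the single binary (the asset name is illustrative, check the v2.17.0 release assets for the exact file name):

```bash
curl -Lo local-ai https://github.com/mudler/LocalAI/releases/download/v2.17.0/local-ai-Linux-x86_64
chmod +x local-ai
./local-ai   # GPU acceleration is picked up if the ROCm/CUDA/Intel system libraries are present
```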
📣 Let's Make Some Noise!
A gigantic THANK YOU to everyone who’s contributed—your feedback, bug squashing, and feature suggestions are what make LocalAI shine. To all our heroes out there supporting other users and sharing their expertise, you’re the real MVPs!
Remember, LocalAI thrives on community support, not big corporate bucks. If you love what we're building, show some love! A shoutout on social (@LocalAI_OSS and @mudler_it on Twitter/X), joining our sponsors, or simply starring us on GitHub makes all the difference.
Also, if you haven't yet joined our Discord, come on over! Here's the link: https://discord.gg/uJAeKSAGDy
Thanks a ton, and... enjoy this release!
What's Changed
Bug fixes 🐛
- fix: gpu fetch device info by @sozercan in #2403
- fix(watcher): do not emit fatal errors by @mudler in #2410
- fix: install pytorch from proper index for hipblas builds by @cryptk in #2413
- fix: pin version of setuptools for intel builds to work around #2406 by @cryptk in #2414
- bugfix: CUDA acceleration not working by @fakezeta in #2475
- fix: `pkg/downloader` should respect basePath for `file://` URLs by @dave-gray101 in #2481
- fix: chat webui response parsing by @sozercan in #2515
- fix(stream): do not break channel consumption by @mudler in #2517
- fix(Makefile): enable STATIC on dist by @mudler in #2569
Exciting New Features 🎉
- feat(images): do not install python deps in the core image by @mudler in #2425
- feat(hipblas): extend default hipblas GPU_TARGETS by @mudler in #2426
- feat(build): add arm64 core containers by @mudler in #2421
- feat(functions): allow parallel calls with mixed/no grammars by @mudler in #2432
- feat(image): support `response_type` in the OpenAI API request by @prajwalnayak7 in #2347
- feat(swagger): update swagger by @localai-bot in #2436
- feat(functions): better free string matching, allow to expect strings after JSON by @mudler in #2445
- build(Makefile): add back single target to build native llama-cpp by @mudler in #2448
- feat(functions): allow `response_regex` to be a list by @mudler in #2447
- TTS API improvements by @blob42 in #2308
- feat(transformers): various enhancements to the transformers backend by @fakezeta in #2468
- feat(webui): enhance card visibility by @mudler in #2473
- feat(default): use number of physical cores as default by @mudler in #2483
- feat: fiber CSRF by @dave-gray101 in #2482
- feat(amdgpu): try to build in single binary by @mudler in #2485
- feat: `OpaqueErrors` to hide error information by @dave-gray101 in #2486
- build(intel): bundle intel variants in single-binary by @mudler in #2494
- feat(install): add install.sh for quick installs by @mudler in #2489
- feat(llama.cpp): guess model defaults from file by @mudler in #2522
- feat(ui): add page to talk with voice, transcription, and tts by @mudler in #2520
- feat(arm64): enable single-binary builds by @mudler in #2490
- feat(util): add util command to print GGUF information by @mudler in #2528
- feat(defaults): add defaults for Command-R models by @mudler in #2529
- feat(detection): detect by template in gguf file, add qwen2, phi, mistral and chatml by @mudler in #2536
- feat(gallery): show available models in website, allow `local-ai models install` to install from galleries by @mudler in #2555
- feat(gallery): uniform download from CLI by @mudler in #2559
- feat(guesser): identify gemma models by @mudler in #2561
- feat(binary): support extracted bundled libs on darwin by @mudler in #2563
- feat(darwin): embed grpc libs by @mudler in #2567
- feat(build): bundle libs for arm64 and x86 linux binaries by @mudler in #2572
- feat(libpath): refactor and expose functions for external library paths by @mudler in #2578
🧠 Models
- models(gallery): add Mirai Nova by @mudler in #2405
- models(gallery): add Mahou by @mudler in #2411
- models(gallery): add minicpm by @mudler in #2412
- models(gallery): add poppy porpoise 0.85 by @mudler in #2415
- models(gallery): add alpha centauri by @mudler in #2416
- models(gallery): add cream-phi-13b by @mudler in #2417
- models(gallery): add stheno-mahou by @mudler in #2418
- models(gallery): add iterative-dpo, fix minicpm by @mudler in #2422
- models(gallery): add una-thepitbull by @mudler in #2435
- models(gallery): add halu by @mudler in #2434
- models(gallery): add neuraldaredevil by @mudler in #2439
- models(gallery): add Codestral by @mudler in #2442
- models(gallery): add mopeymule by @mudler in #2449
- models(gallery): ⬆️ update checksum by @localai-bot in #2451
- models(gallery): add anjir by @mudler in #2454
- models(gallery): add llama3-11b by @mudler in #2455
- models(gallery): add ultron by @mudler in #2456
- models(gallery): add poppy porpoise 1.0 by @mudler in #2459
- models(gallery): add Neural SOVLish Devil by @mudler in #2460
- models(gallery): add all whisper variants by @mudler in #2462
- models(gallery): ⬆️ update checksum by @localai-bot in #2463
- models(gallery): add gemma-2b by @mudler in #2466
- models(gallery): add fimbulvetr iqmatrix version by @mudler in #2470
- models(gallery): add new poppy porpoise versions by @mudler in #2471
- models(gallery): add dolphin-2.9.2-Phi-3-Medium by @mudler in #2492
- models(gallery): add dolphin-2.9.2-phi-3-Medium-abliterated by @mudler in #2495
- models(gallery): add nyun by @mudler in #2496
- models(gallery): add phi-3-4x4b by @mudler in #2497
- models(gallery): add llama-3-instruct-8b-SimPO-ExPO by @mudler in #2498
- models(gallery): add Llama-3-Yggdrasil-2.0-8B by @mudler in #2499
- models(gallery): add l3-8b-stheno-v3.2-iq-imatrix by @mudler in #2500
- models(gallery): add llama3-8B-aifeifei-1.0-iq-imatrix by @mudler in #2509
- models(gallery): add rawr_llama3_8b-iq-imatrix by @mudler in #2510
- models(gallery): add llama3-8b-feifei-1.0-iq-imatrix by @mudler in #2511
- models(gallery): ⬆️ update checksum by @localai-bot in #2519
- models(gallery): add llama3-8B-aifeifei-1.2-iq-imatrix by @mudler in #2544
- models(gallery): add hathor-l3-8b-v.01-iq-imatrix by @mudler in #2545
- models(gallery): add l3-aethora-15b by @mudler in #2546
- models(gallery): add llama-salad-8x8b by @mudler in #2547
- models(gallery): add average_normie_v3.69_8b-iq-imatrix by @mudler in #2548
- models(gallery): add duloxetine by @mudler in #2549
- models(gallery): add badger-lambda-llama-3-8b by @mudler in #2550
- models(gallery): add firefly-gemma-7b by @mudler in #2576
- models(gallery): add dolphin-qwen by @mudler in #2580
- models(gallery): add tess-v2.5-phi-3-medium-128k-14b by @mudler in #2581
- models(gallery): add hathor_stable-v0.2-l3-8b by @mudler in #2582
- models(gallery): add samantha-qwen2 by @mudler in #2586
- models(gallery): add gemma-1.1-7b-it by @mudler in #2588
📖 Documentation and examples
- Update quickstart.md by @mudler in #2404
- docs: fix p2p commands by @mudler in #2472
- README: update sponsors list by @mudler in #2476
- Add integrations by @reid41 in #2535
- docs(gallery): lazy-load images by @mudler in #2557
- Fix standard image latest Docker tags by @nwithan8 in #2574
👒 Dependencies
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2399
- ⬆️ Update docs version mudler/LocalAI by @localai-bot in #2398
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2408
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2409
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2419
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2427
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2428
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2433
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2437
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2438
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2444
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2443
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2452
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2453
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2465
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2467
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2477
- toil: bump grpc version by @dave-gray101 in #2480
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2487
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2493
- deps(whisper): update, add libcufft-dev by @mudler in #2501
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2507
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2508
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2518
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2524
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2531
- chore(deps): Update Dockerfile by @reneleonhardt in #2532
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2539
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2540
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2552
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2551
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2554
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2564
- ⬆️ Update ggerganov/whisper.cpp by @localai-bot in #2565
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2570
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2575
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2584
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #2587
Other Changes
- ci: fix sd release by @sozercan in #2400
- ci(grpc-cache): also arm64 by @mudler in #2423
- ci: push test images when building PRs by @mudler in #2424
- ci: pin build-time protoc by @mudler in #2461
- feat(swagger): update swagger by @localai-bot in #2464
- ci: run release build on self-hosted runners by @mudler in #2505
- experiment: `-j4` for `build-linux:` by @dave-gray101 in #2514
- test: e2e /reranker endpoint by @dave-gray101 in #2211
- ci: pack less libs inside the binary by @mudler in #2579
New Contributors
- @prajwalnayak7 made their first contribution in #2347
- @reneleonhardt made their first contribution in #2532
- @reid41 made their first contribution in #2535
- @nwithan8 made their first contribution in #2574
Full Changelog: v2.16.0...v2.17.0