What's Changed
Exciting New Features 🎉
- feat(llama.cpp): expose cache_type_k and cache_type_v for quant of kv cache by @mudler in #4329
- feat(template): read jinja templates from gguf files by @mudler in #4332
- feat: stream tokens usage by @mintyleaf in #4415
- feat(Dockerfile): allow to skip driver installation by @mudler in #4447
- feat(ui): path prefix support via HTTP header by @mgoltzsche in #4497
- feat(downloader): resume partial downloads by @Saavrm26 in #4537
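
As an illustration of the resume-partial-downloads entry above, here is a minimal sketch of how resuming is commonly done with a standard HTTP `Range` request. This is not LocalAI's implementation: the function names `resume_request_headers` and `append_chunk` and the `.partial` file name are hypothetical, and the sketch assumes the server honors byte-range requests.

```python
import os


def resume_request_headers(partial_path):
    """Build headers asking the server only for bytes not yet on disk.

    Hypothetical helper: returns an empty dict when nothing has been
    downloaded yet, so the caller falls back to a normal full request.
    """
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    if offset == 0:
        return {}
    # "bytes=N-" requests everything from byte offset N to the end of the file.
    return {"Range": f"bytes={offset}-"}


def append_chunk(partial_path, chunk):
    """Append newly received bytes to the partial file (created if missing)."""
    with open(partial_path, "ab") as f:
        f.write(chunk)


if __name__ == "__main__":
    import tempfile

    path = os.path.join(tempfile.mkdtemp(), "model.gguf.partial")
    print(resume_request_headers(path))  # nothing on disk yet
    append_chunk(path, b"first bytes")
    print(resume_request_headers(path))  # now asks for the remainder
```

A real client would also check that the server answers with status 206 (Partial Content) before appending; a 200 response means the server ignored the range and the file should be restarted from scratch.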
🧠 Models
- chore(model gallery): add rp-naughty-v1.0c-8b by @mudler in #4322
- chore(model gallery): add loki-v2.6-8b-1024k by @mudler in #4321
- chore(model gallery): add math-iio-7b-instruct by @mudler in #4323
- chore(model gallery): add llama-3.3-70b-instruct by @mudler in #4333
- chore(model gallery): add mn-chunky-lotus-12b by @mudler in #4337
- chore(model gallery): add virtuoso-small by @mudler in #4338
- chore(model gallery): add bio-medical-llama-3-8b by @mudler in #4339
- chore(model gallery): add qwen2.5-7b-homeranvita-nerdmix by @mudler in #4343
- chore(model gallery): add impish_mind_8b by @mudler in #4344
- chore(model gallery): add tulu-3.1-8b-supernova-smart by @mudler in #4347
- chore(model gallery): add qwen2.5-math-14b-instruct by @mudler in #4355
- chore(model gallery): add intellect-1-instruct by @mudler in #4356
- chore(model gallery): add b-nimita-l3-8b-v0.02 by @mudler in #4357
- chore(model gallery): add sailor2-1b-chat by @mudler in #4363
- chore(model gallery): add sailor2-8b-chat by @mudler in #4364
- chore(model gallery): add sailor2-20b-chat by @mudler in #4365
- chore(model gallery): add 72b-qwen2.5-kunou-v1 by @mudler in #4369
- chore(model gallery): add deepthought-8b-llama-v0.01-alpha by @mudler in #4370
- chore(model gallery): add l3.3-70b-euryale-v2.3 by @mudler in #4371
- chore(model gallery): add l3.3-ms-evayale-70b by @mudler in #4374
- chore(model gallery): add evathene-v1.3 by @mudler in #4375
- chore(model gallery): add hermes-3-llama-3.2-3b by @mudler in #4376
- chore(model gallery): add fusechat-gemma-2-9b-instruct by @mudler in #4379
- chore(model gallery): add fusechat-qwen-2.5-7b-instruct by @mudler in #4380
- chore(model gallery): add chronos-gold-12b-1.0 by @mudler in #4381
- fix: correct gallery/index.yaml by @godsey in #4384
- chore(model gallery): add fusechat-llama-3.2-3b-instruct by @mudler in #4386
- chore(model gallery): add fusechat-llama-3.1-8b-instruct by @mudler in #4387
- chore(model gallery): add neumind-math-7b-instruct by @mudler in #4388
- chore(model gallery): add naturallm-7b-instruct by @mudler in #4392
- chore(model gallery): add marco-o1-uncensored by @mudler in #4393
- chore(model gallery): add qwen2-7b-multilingual-rp by @mudler in #4394
- chore(model gallery): add qwq-lcot-7b-instruct by @mudler in #4419
- chore(model gallery): add llama-openreviewer-8b by @mudler in #4422
- chore(model gallery): add falcon3-1b-instruct by @mudler in #4423
- chore(model gallery): add falcon3-3b-instruct by @mudler in #4424
- chore(model gallery): add qwen2-vl-72b-instruct by @mudler in #4425
- chore(model gallery): add falcon3-10b-instruct by @mudler in #4426
- chore(model gallery): add llama-song-stream-3b-instruct by @mudler in #4431
- chore(model gallery): add llama-chat-summary-3.2-3b by @mudler in #4432
- chore(model gallery): add tq2.5-14b-aletheia-v1 by @mudler in #4440
- chore(model gallery): add tq2.5-14b-neon-v1 by @mudler in #4441
- chore(model gallery): add orca_mini_v8_1_70b by @mudler in #4444
- chore(model gallery): add anubis-70b-v1 by @mudler in #4446
- chore(model gallery): add llama-3.3-70b-instruct-ablated by @mudler in #4448
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #4487
- chore(model gallery): add l3.3-ms-evalebis-70b by @mudler in #4488
- chore(model gallery): add tqwendo-36b by @mudler in #4489
- chore(model gallery): add rombos-llm-70b-llama-3.3 by @mudler in #4490
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #4492
- chore(model gallery): add fastllama-3.2-1b-instruct by @mudler in #4493
- chore(model gallery): add dans-personalityengine-v1.1.0-12b by @mudler in #4494
- chore(model gallery): add llama-3.1-8b-open-sft by @mudler in #4495
- chore(model gallery): add qvq-72b-preview by @mudler in #4498
- chore(model gallery): add teleut-7b-rp by @mudler in #4499
- chore(model gallery): add falcon3-1b-instruct-abliterated by @mudler in #4501
- chore(model gallery): add falcon3-3b-instruct-abliterated by @mudler in #4502
- chore(model gallery): add falcon3-10b-instruct-abliterated by @mudler in #4503
- chore(model gallery): add falcon3-7b-instruct-abliterated by @mudler in #4504
- chore(model gallery): add control-nanuq-8b by @mudler in #4506
- chore(model gallery): add miscii-14b-1028 by @mudler in #4507
- chore(model gallery): add miscii-14b-1225 by @mudler in #4508
- chore(model gallery): add qwen2.5-32b-rp-ink by @mudler in #4517
- chore(model gallery): add huatuogpt-o1-8b by @mudler in #4518
- chore(model gallery): add q2.5-veltha-14b-0.5 by @mudler in #4519
- chore(model gallery): add smallthinker-3b-preview by @mudler in #4521
- chore(model gallery): add mn-12b-mag-mell-r1-iq-arm-imatrix by @mudler in #4522
- chore(model gallery): add captain-eris-diogenes_twilight-v0.420-12b by @mudler in #4523
- chore(model gallery): add violet_twilight-v0.2 by @mudler in #4524
- chore(model gallery): add qwenwify2.5-32b-v4.5 by @mudler in #4525
- chore(model gallery): add sainemo-remix by @mudler in #4526
- chore(model gallery): add l3.1-purosani-2-8b by @mudler in #4527
- chore(model gallery): add nera_noctis-12b by @mudler in #4530
- chore(model gallery): add drt-o1-7b by @mudler in #4533
- chore(model gallery): add codepy-deepthink-3b by @mudler in #4534
- chore(model gallery): add llama3.1-8b-prm-deepseek-data by @mudler in #4535
- chore(model gallery): add experimental-lwd-mirau-rp-14b-iq-imatrix by @mudler in #4539
- chore(model gallery): add llama-deepsync-3b by @mudler in #4540
- chore(model gallery): add qwentile2.5-32b-instruct by @mudler in #4541
- chore(model gallery): add 32b-qwen2.5-kunou-v1 by @mudler in #4545
- chore(model gallery): add triangulum-10b by @mudler in #4546
- chore(model gallery): add 14b-qwen2.5-kunou-v1 by @mudler in #4547
- chore(model gallery): add dolphin3.0-llama3.1-8b by @mudler in #4553
- chore(model gallery): add dolphin3.0-llama3.2-1b by @mudler in #4554
- chore(model gallery): add dolphin3.0-llama3.2-3b by @mudler in #4555
- chore(model gallery): add dolphin3.0-qwen2.5-0.5b by @mudler in #4558
- chore(model gallery): add dolphin3.0-qwen2.5-1.5b by @mudler in #4559
- chore(model gallery): add dolphin3.0-qwen2.5-3b by @mudler in #4560
- chore(model gallery): add phi-4 by @mudler in #4562
- chore(model gallery): add 14b-qwen2.5-freya-x1 by @mudler in #4566
- chore(model gallery): add minithinky-v2-1b-llama-3.2 by @mudler in #4567
- chore(model gallery): add huatuogpt-o1-7b-v0.1 by @mudler in #4568
- chore(model gallery): add 70b-l3.3-cirrus-x1 by @mudler in #4569
- chore(model gallery): add gwq-9b-preview2 by @mudler in #4572
- chore(model gallery): add chuluun-qwen2.5-72b-v0.01 by @mudler in #4573
- chore(model gallery): add phi-3.5-moe-instruct by @mudler in #4574
📖 Documentation and examples
- chore(docs): update available backends by @mudler in #4325
- chore(docs): patch p2p detail in env and docs by @jtwolfe in #4434
- docs: update compatibility-table.md by @mudler in #4557
👒 Dependencies
- chore: ⬆️ Update ggerganov/llama.cpp to 59f4db10883a4f3e855cffbf2c3ab68430e95272 by @localai-bot in #4319
- chore: ⬆️ Update leejet/stable-diffusion.cpp to 9578fdcc4632dc3de5565f28e2fb16b7c18f8d48 by @localai-bot in #4320
- chore: ⬆️ Update ggerganov/llama.cpp to c9c6e01daedac542b174c235872569fce5385982 by @localai-bot in #4328
- chore: ⬆️ Update ggerganov/llama.cpp to c5ede3849fc021174862f9c0bf8273808d8f0d39 by @localai-bot in #4330
- chore: ⬆️ Update ggerganov/llama.cpp to 3573fa8e7b7f0865638b52b4e9b4d2006f0558a2 by @localai-bot in #4335
- chore: ⬆️ Update ggerganov/llama.cpp to e52522b8694ae73abf12feb18d29168674aa1c1b by @localai-bot in #4342
- chore(deps): Bump docs/themes/hugo-theme-relearn from be85052 to bd1f3d3 by @dependabot in #4348
- chore: ⬆️ Update ggerganov/llama.cpp to 26a8406ba9198eb6fdd8329fa717555b4f77f05f by @localai-bot in #4353
- chore: ⬆️ Update ggerganov/llama.cpp to dafae66cc242eb766797194d3c85c5e502625623 by @localai-bot in #4360
- chore: ⬆️ Update ggerganov/llama.cpp to 235f6e14bf0ed0211c51aeff14139038ae1000aa by @localai-bot in #4366
- chore: ⬆️ Update ggerganov/llama.cpp to 274ec65af6e54039eb95cb44904af5c945dca1fa by @localai-bot in #4372
- feat(llama.cpp): bump and adapt to upstream changes by @mudler in #4378
- chore: ⬆️ Update ggerganov/llama.cpp to e52aba537a34d51a65cddec6bc6dafc9031edc63 by @localai-bot in #4385
- chore: ⬆️ Update ggerganov/llama.cpp to a0974156f334acf8af5858d7ede5ab7d7490d415 by @localai-bot in #4391
- chore(llama.cpp): bump, drop penalize_nl by @mudler in #4418
- chore: ⬆️ Update ggerganov/llama.cpp to 081b29bd2a3d91e7772e3910ce223dd63b8d7d26 by @localai-bot in #4421
- chore: ⬆️ Update ggerganov/llama.cpp to 0bf2d10c5514ff61b99897a4a5054f846e384e1e by @localai-bot in #4429
- chore: ⬆️ Update ggerganov/llama.cpp to cd920d0ac38ec243605a5a57c50941140a193f9e by @localai-bot in #4433
- chore: ⬆️ Update ggerganov/llama.cpp to d408bb9268a988c5a60a5746d3a6430386e7604d by @localai-bot in #4437
- chore: ⬆️ Update ggerganov/llama.cpp to eb5c3dc64bd967f2e23c87d9dec195f45468de60 by @localai-bot in #4442
- chore: ⬆️ Update ggerganov/llama.cpp to 5cd85b5e008de2ec398d6596e240187d627561e3 by @localai-bot in #4445
- chore: ⬆️ Update ggerganov/llama.cpp to ebdee9478ca7ba65497b9b96f7457698c6ee5115 by @localai-bot in #4451
- chore(deps): Bump docs/themes/hugo-theme-relearn from bd1f3d3 to ec88e24 by @dependabot in #4460
- chore: ⬆️ Update ggerganov/llama.cpp to 32d6ee6385b3fc908b283f509b845f757a6e7206 by @localai-bot in #4486
- chore: ⬆️ Update ggerganov/llama.cpp to 2cd43f4900ba0e34124fdcbf02a7f9df25a10a3d by @localai-bot in #4491
- chore: ⬆️ Update ggerganov/llama.cpp to 9ba399dfa7f115effc63d48e6860a94c9faa31b2 by @localai-bot in #4496
- chore: ⬆️ Update ggerganov/llama.cpp to d79d8f39b4da6deca4aea8bf130c6034c482b320 by @localai-bot in #4500
- chore: ⬆️ Update ggerganov/llama.cpp to f865ea149d71ef883e3780fced8a20a1464eccf4 by @localai-bot in #4510
- chore: ⬆️ Update ggerganov/llama.cpp to a813badbbdf0d38705f249df7a0c99af5cdee678 by @localai-bot in #4512
- chore(deps): Bump gradio from 3.48.0 to 5.9.1 in /backend/python/openvoice by @dependabot in #4514
- chore: ⬆️ Update leejet/stable-diffusion.cpp to dcf91f9e0f2cbf9da472ee2a556751ed4bab2d2a by @localai-bot in #4509
- chore: ⬆️ Update ggerganov/llama.cpp to 716bd6dec3e044e5c325386b5b0483392b24cefe by @localai-bot in #4516
- chore(deps): Bump docs/themes/hugo-theme-relearn from ec88e24 to d25f856 by @dependabot in #4515
- chore: ⬆️ Update ggerganov/llama.cpp to 0827b2c1da299805288abbd556d869318f2b121e by @localai-bot in #4520
- chore: ⬆️ Update ggerganov/llama.cpp to 2f0ee84b9b02d2a98742308026f060ebdc2423f1 by @localai-bot in #4528
- chore(deps): bump llama.cpp to 4b0c638b9 by @mudler in #4532
- chore: ⬆️ Update ggerganov/llama.cpp to 9394bbd484f802ce80d2858033583af3ef700d25 by @localai-bot in #4536
- chore(deps): bump grpcio to 1.69.0 by @mudler in #4543
- chore: ⬆️ Update ggerganov/llama.cpp to b56f079e28fda692f11a8b59200ceb815b05d419 by @localai-bot in #4544
- chore(deps): Bump docs/themes/hugo-theme-relearn from d25f856 to 80e448e by @dependabot in #4549
- chore: ⬆️ Update ggerganov/llama.cpp to ecebbd292d741ac084cf248146b2cfb17002aa1d by @localai-bot in #4552
- chore: ⬆️ Update ggerganov/llama.cpp to 53ff6b9b9fb25ed0ec0a213e05534fe7c3d0040f by @localai-bot in #4556
- chore: ⬆️ Update ggerganov/llama.cpp to 8d59d911711b8f1ba9ec57c4b192ccd2628af033 by @localai-bot in #4561
- chore(deps): bump edgevpn to v0.29.0 by @mudler in #4564
- chore: ⬆️ Update ggerganov/llama.cpp to 1204f9727005974587d6fc1dcd4d4f0ead87c856 by @localai-bot in #4570
- chore: ⬆️ Update ggerganov/llama.cpp to ba8a1f9c5b675459c55a83e3f97f10df3a66c788 by @localai-bot in #4575
Other Changes
- Updated links of yamls by @PetrFlegr in #4324
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #4327
- Revert "feat: include tokens usage for streamed output" by @mudler in #4336
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #4341
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #4359
- fix(python): remove pin to setuptools, pin python version by @mudler in #4395
- chore(tests): stabilize tts test by @mudler in #4417
- fix(intel): pin torch and intel-extensions by @mudler in #4435
- fix(deps): pin openvoice pytorch/torchaudio by @mudler in #4436
- fix(openvoice): do not pin numpy by @mudler in #4438
- fix(openvoice): pin numpy before installing torch by @mudler in #4439
- chore(nvidia-l4t): add l4t arm64 images by @mudler in #4449
- chore(ci): comment arm64 job until we find a native CI runner by @mudler in #4452
- chore(docs): add nvidia l4t instructions by @mudler in #4454
- chore: update labeler.yml to include go files by @mudler in #4565
New Contributors
- @PetrFlegr made their first contribution in #4324
- @godsey made their first contribution in #4384
- @mgoltzsche made their first contribution in #4497
- @Saavrm26 made their first contribution in #4537
Full Changelog: v2.24.0...v2.25.0