github lemonade-sdk/lemonade v10.8.1

6 hours ago

Headline

  • Speculative decoding now accepts draft/MTP/EAGLE3 checkpoints, with new Gemma-4 MTP models ready to pull.
  • ROCm installation and GPU detection are restored for Radeon RX RDNA2/3/4 dGPUs on Windows and Linux.
  • Backend installs are now crash-safe, with resilient downloads that fall back gracefully when a release lookup or cache snapshot is unavailable.
  • lemonade bench --response-log captures each model's responses and run metadata to a JSONL file for later quality evaluation.
  • The lemonade backends command now lists only supported recipes and backends by default; use lemonade backends --all to see every available option.
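The response log written by lemonade bench --response-log is a JSONL file, meaning one JSON object per line. A minimal sketch for loading such a file for later evaluation follows. The record schema is not described in these notes, so the reader makes no assumptions about field names; the helper name read_jsonl and the file name in the usage note are hypothetical.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON object per non-empty line of a JSONL file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                # Report the offending line so a truncated log is easy to locate.
                raise ValueError(f"{path}:{line_number}: invalid JSON: {exc}") from exc
```

For example, records = list(read_jsonl("responses.jsonl")) would load every record into memory; iterating over the generator directly avoids that for large logs.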

Breaking Changes

  • The --model-draft, -md, and --spec-draft-model flags are now reserved for internal speculative-decoding support and can no longer be passed manually through llamacpp_args.
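Configurations that still pass one of these flags through llamacpp_args will need updating. A rough sketch of a pre-flight check follows, assuming the arguments are held as a single command-line style string. The helper name find_reserved_flags is hypothetical and not part of Lemonade; only the three flag names come from this release note.

```python
import shlex

# Flags listed as reserved in this release's breaking changes.
RESERVED_FLAGS = {"--model-draft", "-md", "--spec-draft-model"}


def find_reserved_flags(llamacpp_args):
    """Return the reserved flags present in an argument string, in first-seen order."""
    found = []
    for token in shlex.split(llamacpp_args):
        flag = token.split("=", 1)[0]  # also catch the --flag=value form
        if flag in RESERVED_FLAGS and flag not in found:
            found.append(flag)
    return found
```

An empty result means the argument string does not use any of the reserved flags.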

Lemonade Server

Operating System   Downloads
Windows            lemonade.msi
Ubuntu 24.04+      Launchpad PPA
Debian 13          lemonade-server_10.8.1-debian13_amd64.deb
Fedora 43          lemonade-server-10.8.1-fc43.x86_64.rpm
Fedora 44          lemonade-server-10.8.1-fc44.x86_64.rpm
macOS              Lemonade-10.8.1-Darwin.pkg

Other platforms? See our Installation Options for Docker, Snap, Arch, Debian, and more.

Embeddable Lemonade

Portable binaries for bundling into your own installer. Run lemond ./ as a subprocess.
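As a sketch of what running lemond ./ as a subprocess might look like from a host application, the snippet below launches the binary from an extracted bundle directory. The function names, the bundle location, and the assumption that the binary sits at the top level of the bundle are all hypothetical; only the lemond ./ invocation comes from the note above.

```python
import subprocess
from pathlib import Path


def build_command(bundle_dir, binary_name="lemond"):
    """Return the argv for `lemond ./`, using an absolute path to the binary."""
    # Resolve first so the path stays valid after the child changes directory.
    binary = Path(bundle_dir).resolve() / binary_name
    return [str(binary), "./"]


def start_lemond(bundle_dir, binary_name="lemond"):
    """Launch lemond as a child process with the bundle directory as its cwd."""
    bundle = Path(bundle_dir).resolve()
    return subprocess.Popen(build_command(bundle, binary_name), cwd=str(bundle))
```

The returned Popen object can later be stopped with terminate() followed by wait(). On Windows the binary name would presumably need an .exe suffix, which is why binary_name is a parameter.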

Platform      Download
Ubuntu x64    lemonade-embeddable-10.8.1-ubuntu-x64.tar.gz
Windows x64   lemonade-embeddable-10.8.1-windows-x64.zip
macOS arm64   lemonade-embeddable-10.8.1-macos-arm64.tar.gz

What's Changed

Thanks @GabrielReusRodriguez, @Kushal1213, @Phqen1x, @abn, @bitgamma, @blackdeathdrow, @ckuethe, @fl0rianr, @github-actions, @ianbmacdonald, @jeremyfowers, @jtlayton, @kenvandine, @lucifer-vali, @matthewjhunter, @ramkrishna2910, @sagebind, @superm1 for your awesome contributions to this release!

  • ci: support tagging releases from a release branch by @jeremyfowers in #2272
  • ci: add repo-manager workflow by @jeremyfowers in #2276
  • fix(whisper): drop gfx103X from rocm whisper supported archs by @ramkrishna2910 in #2274
  • test(gguf): add unit tests for MTP / capability label detection (#2176) by @ramkrishna2910 in #2281
  • fix(rocm): accept wildcard GPU arch families in TheRock install gate (#2093 follow-up) by @ramkrishna2910 in #2280
  • fix(test): require Whisper model load before language transcription test by @fl0rianr in #2278
  • docs(cli): correct launch note - LEMONADE_* recipe env vars no longer honored by @ramkrishna2910 in #2275
  • ci(repo-manager): automatic repo-manager fully operational by @jeremyfowers in #2285
  • docs(release): update release process for repo-manager automation by @jeremyfowers in #2288
  • fix(ci): restart macOS server before whisper metal tests by @fl0rianr in #2303
  • auto update and validate sd-cpp by @fl0rianr in #2139
  • fix(macOS): updates whisper from v1.8.4 to v1.8.5 for metal by @fl0rianr in #2309
  • Fix Problem with .devcontainer. It did not copy .devcontainer/reinstall-cmake.sh when building container by @GabrielReusRodriguez in #2273
  • Add support for additional draft checkpoint by @bitgamma in #2317
  • docs: fix Debian 13 installation docs for issue #2299 by @superm1 in #2323
  • docs: add documentation style guide for community and AI-assisted contributions by @kenvandine in #2054
  • Fix copy-to-clipboard buttons silently failing in web-app over HTTP by @blackdeathdrow in #2260
  • Fix(cli): lemonade backends now shows only supported backends. Added --all option to show all backends. by @GabrielReusRodriguez in #2254
  • fix(gpu): support wildcards in GPU detection logic by @jtlayton in #2295
  • Fix ROCm whisper-server startup: add TheRock lib dir to LD_LIBRARY_PATH by @matthewjhunter in #2293
  • Capture output from lemonade bench by @ckuethe in #2214
  • fix(windows): use ProcessManager::run_command for 7z extraction instead of system() Fixes #2313 by @Phqen1x in #2322
  • fix(server): fall back to installed llama.cpp binary when "latest" release lookup fails by @ianbmacdonald in #2279
  • fix(moonshine): return 400 on invalid audio by @abn in #2326
  • fix(backends): stage and verify backend install before removing the working binary by @ianbmacdonald in #2315
  • docs: system-stats API by @jeremyfowers in #2284
  • fix: return 400 instead of 500 when request body is empty or not valid JSON by @Kushal1213 in #2232
  • devcontainers - Create the python env and install reqs to allow python test execution by @GabrielReusRodriguez in #2336
  • systemd: add user-service symlink at /usr/lib/systemd/user/ by @lucifer-vali in #2173
  • fix: resolve shared-repo GGUF variants orphaned by refs/main advance by @ianbmacdonald in #2311
  • Fix: eviction_engine does not run nvidia-smi on AMD configs anymore and some testing problems fixed. by @GabrielReusRodriguez in #2331
  • Update stable-diffusion.cpp to master-709-92a3b73 by @github-actions[bot] in #2335
  • Support image[] parameter for /v1/images/edits by @sagebind in #2321
  • Update llama.cpp to b9747 by @github-actions[bot] in #2333
  • Better auto context size estimate by @bitgamma in #2337
  • Version bump to 10.8.1 to prepare for the release by @kenvandine in #2374
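The backend install fix in #2315 stages and verifies a new install before removing the working binary. The general pattern can be sketched as below. This illustrates the technique only and is not Lemonade's implementation: the function name, the SHA-256 check, and the single-file layout are assumptions made for the example.

```python
import hashlib
import os
import tempfile


def install_verified(new_bytes, expected_sha256, target_path):
    """Stage new_bytes beside target_path, verify it, then swap it in atomically."""
    target_dir = os.path.dirname(os.path.abspath(target_path))
    # Stage in the same directory so the final rename stays on one filesystem.
    fd, staged_path = tempfile.mkstemp(dir=target_dir, prefix=".staged-")
    try:
        with os.fdopen(fd, "wb") as staged:
            staged.write(new_bytes)
            staged.flush()
            os.fsync(staged.fileno())  # make sure the data reaches the disk
        with open(staged_path, "rb") as staged:
            actual = hashlib.sha256(staged.read()).hexdigest()
        if actual != expected_sha256:
            raise ValueError("staged file failed verification; existing install kept")
        os.replace(staged_path, target_path)  # the old file is replaced only here
    except BaseException:
        # Clean up the staged file on any failure; the working file is untouched.
        if os.path.exists(staged_path):
            os.remove(staged_path)
        raise
```

Because the existing file is only replaced in the final step, a crash or failed verification at any earlier point leaves the previous install in place.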

New Contributors

Full Changelog: v10.8.0...v10.8.1


Windows installers are signed. Free code signing provided by SignPath.io, certificate by SignPath Foundation. See our Code Signing Policy.
