github YellowRoseCx/koboldcpp-rocm v1.85.yr0-ROCm
KoboldCPP-v1.85.yr0-ROCm

2 days ago

ROCm backend changes

  • This release ships two builds; if one doesn't work for you, try the other. The only difference is which GPU kernel files are included:
    • koboldcpp_rocm.exe was built with files more similar to those used to compile v1.79.yr1-ROCm
    • koboldcpp_rocm_b2.exe was built with the same files as the previous version
  • Support has been added for experimental HIPGraph usage. (Disabled by default, no performance increase yet.)
  • HIP virtual memory management has been added, but is disabled until upstream fixes are available (ggml-org#11405).

koboldcpp-rocm-1.85

Now with 5% more kobo edition

  • New Features:

    • NEW: Added server-side (networked) save slots! You can now specify a database file when launching KoboldCpp using --savedatafile. You can then save and load persistent stories over the network to that KoboldCpp server, and access them from any other browser or device connected to it. This can also be combined with --password to require an API key to save/load the stories.
    • Added the ability to switch models, settings and configs at runtime! This also allows for remote model swapping. Credits to @esolithe for original reference implementation.
      • Launch with --admin to enable this feature, and also provide --admindir containing .kcpps launch configs.
      • Optionally, provide --adminpassword to secure admin functions
      • You will be able to swap between any model's config at runtime from the Admin panel in Lite. You can prepare .kcpps configs for different layers, backends, models, etc.
      • KoboldCpp will then terminate the current instance and relaunch with the new config.
    • Added Top-N Sigma sampler (credit @EquinoxPsychosis). Note that this sampler can only be combined with Top-K, Temperature, and XTC.
    • Added --exportconfig, allowing users to export any set of launch arguments as a .kcpps config file from the command line. This file can also be used subsequently for model switching in admin mode.
    • Minor refactors for TFS and rep pen by @Reithan
    • CLIP vision embeddings can now be reused between multiple requests, so they won't have to be reprocessed if the images don't change.
    • Context shifting disabled when using mrope (used in Qwen2VL) as it does not work correctly.
    • Now defaults to AutoGuess for chat completions adapter. Set to "Alpaca" for the old behavior instead.
    • You can now set the maximum resolution accepted by vision mmprojs with --visionmaxres. Images larger than that will be downscaled before processing.
    • You can now set a length limit for TTS using --ttsmaxlen when launching. This limits the number of TTS tokens that can be generated (range 512 to 4096). Each second of audio is about 75 tokens.
    • Added support for using aria2c and wget for model downloading if detected on system. (credits @henk717).
    • It's also now possible to specify multiple URLs when loading multipart models online with --model [url1] [url2]... (CLI only), which will allow KoboldCpp to download multiple model file URLs.
    • Added automatic recovery in admin mode: if switching to a faulty config fails, it will attempt to roll back to the original known-good config.
    • Added cloudflared tunnel download for aarch64 (thanks @FlippFuzz). Also allowed SSL to be combined with remote tunnels.
  • Kobold Lite

    • NEW: Added DeepSeek instruct template, and added support for reasoning/thinking template tags. You can configure thinking rendering behavior from Context > Tokens > Thinking.
    • NEW: Finally allows specifying individual start and end instruct tags instead of combining them. Toggle this in Settings > Toggle End Tags.
    • NEW: Multi-pass websearch added. This allows you to specify a template that is used to generate the search query.
    • Improved thinking support and display: you can now force-inject <think> tokens into AI replies, or filter out old thoughts in subsequent generations.
    • Reworked and improved load/save UI, added 2 extra local slots and 8 extra remote save slots.
    • Top-N sigma support
    • Added customization options for assistant jailbreak prompt
    • Refactored 3rd party scenario loader (thanks @Desaroll)
    • Fix websearch button visibility
    • Improved instruct formatting in classic UI
    • Fixed some LaTeX and markdown edge cases
    • Upped max length slider to 1024 if detected context is larger than 4096.
    • Added a websearch toggle button
    • TTS now allows downloading the audio output as a file when testing it, instead of just playing the sound.
    • Some regex parsing fixes
    • Added admin panel
    • Multiple other fixes and improvements
  • Fixes:

    • Merged fixes and improvements from upstream
    • Fixed .kcppt templates backend override not working
    • Updated clinfo binary for Windows.
    • Fixed MoE experts override not working for Deepseek
    • Fixed multiple loader bugs when using the AutoGuess adapter.
    • Fixed images failing to generate when using the AutoGuess adapter.
    • Removed TTS caching as it was not very good.
    • Fixed a bug with TTS that could cause a crash.
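
As a rough illustration of the admin-mode and save-slot flags listed under New Features, the sketch below assembles a launch command in Python. Only the flag names (--admin, --admindir, --adminpassword, --savedatafile) come from these notes. The helper name build_launch_args, the example paths, and the assumption that --admindir takes a directory path and --savedatafile takes a file path are illustrative guesses, not confirmed behavior.

```python
import shlex


def build_launch_args(executable, admindir, adminpassword=None, savedatafile=None):
    """Assemble an argument list from flags named in these release notes.

    Hypothetical helper: the value formats passed to each flag are assumptions.
    """
    # --admin enables runtime switching; --admindir is assumed to be a directory
    # holding the .kcpps launch configs.
    args = [executable, "--admin", "--admindir", admindir]
    if adminpassword is not None:
        # Optional: secures the admin functions.
        args += ["--adminpassword", adminpassword]
    if savedatafile is not None:
        # Optional: database file backing the networked save slots.
        args += ["--savedatafile", savedatafile]
    return args


if __name__ == "__main__":
    cmd = build_launch_args(
        "koboldcpp_rocm.exe",          # example executable from this release
        "configs",                     # hypothetical directory of .kcpps files
        adminpassword="change-me",     # placeholder value
        savedatafile="stories.db",     # hypothetical database filename
    )
    # Print a copy-pasteable command line instead of launching anything.
    print(shlex.join(cmd))
```

Building the command as a list keeps each flag and its value separate, so it can be handed to subprocess.run without shell quoting issues.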
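
To get a feel for what a given --ttsmaxlen value means in practice, the sketch below converts a token limit into approximate audio length using the "about 75 tokens per second" figure and the 512 to 4096 range quoted under New Features. The function name approx_audio_seconds is hypothetical, and the result is only as accurate as that rough figure.

```python
TOKENS_PER_SECOND = 75   # approximate figure quoted in these notes
MIN_TTS_TOKENS = 512     # lower bound of the documented range
MAX_TTS_TOKENS = 4096    # upper bound of the documented range


def approx_audio_seconds(ttsmaxlen):
    """Estimate the audio length (in seconds) allowed by a TTS token limit."""
    if not MIN_TTS_TOKENS <= ttsmaxlen <= MAX_TTS_TOKENS:
        raise ValueError(
            f"ttsmaxlen must be between {MIN_TTS_TOKENS} and {MAX_TTS_TOKENS}"
        )
    return ttsmaxlen / TOKENS_PER_SECOND


if __name__ == "__main__":
    for limit in (512, 4096):
        print(limit, round(approx_audio_seconds(limit), 1))
    # prints:
    # 512 6.8
    # 4096 54.6
```

So the documented range corresponds to roughly 7 to 55 seconds of generated audio.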
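
These notes do not describe how the Top-N Sigma sampler works. As a minimal sketch based on the commonly described top-nσ idea (keep only tokens whose logit lies within n standard deviations of the highest logit, then renormalize), the Python below shows the filtering step. This is an assumption about the general technique, not KoboldCpp's actual implementation, and the function names are hypothetical.

```python
import math
import statistics


def top_n_sigma_filter(logits, n=1.0):
    """Mask logits that fall more than n standard deviations below the maximum."""
    # Population standard deviation over all candidate logits.
    threshold = max(logits) - n * statistics.pstdev(logits)
    return [x if x >= threshold else -math.inf for x in logits]


def softmax(logits):
    """Convert logits to probabilities; masked (-inf) entries get probability 0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [5.0, 4.5, 1.0, 0.0, -2.0]
    kept = top_n_sigma_filter(logits, n=1.0)
    # Only the two highest logits survive with n = 1.0 for this input.
    print(kept)
    print(softmax(kept))
```

Because the cutoff is defined relative to the maximum logit and the spread of the distribution, a larger n admits more candidate tokens and a smaller n admits fewer.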
