KoboldCPP-v1.55.yr0-ROCm

Added Dynamic Temperature (DynaTemp), which is specified by a Temperature Value and a Temperature Range (Credits: @kalomaze). When enabled, the actual temperature is adjusted automatically within the interval Temperature ± DynaTempRange. For example, setting temperature=0.4 and dynatemp_range=0.1 results in a minimum temperature of 0.3 and a maximum of 0.5. For ease of use, Lite also provides a UI for selecting the DynaTemp minimum and maximum temperature directly. Either input method works, and each automatically updates the other.
Now tries to reuse an existing cloudflared file when running a remote tunnel, and also handles the case where cloudflared fails to download correctly.
Added a field to the perf endpoint that shows the most recently used seed.
Switched CUDA pool malloc back to the old implementation.
Updated Lite, adding support for DynaTemp.
Merged new improvements and fixes from upstream llama.cpp.
Various minor fixes.
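
The DynaTemp arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the min/max relationship stated in these notes, not KoboldCPP's actual code: the function names are hypothetical, and clamping the lower bound at zero is an assumption.

```python
def dynatemp_bounds(temperature, dynatemp_range):
    """Return (min_temp, max_temp) for a base temperature and a range."""
    # Assumption: a negative lower bound is clamped to zero.
    low = max(0.0, temperature - dynatemp_range)
    high = temperature + dynatemp_range
    return low, high


def dynatemp_from_bounds(min_temp, max_temp):
    """Inverse mapping: recover (temperature, dynatemp_range) from min/max."""
    temperature = (min_temp + max_temp) / 2
    dynatemp_range = (max_temp - min_temp) / 2
    return temperature, dynatemp_range


low, high = dynatemp_bounds(0.4, 0.1)
print(round(low, 6), round(high, 6))  # prints: 0.3 0.5
```

The second function mirrors what a min/max style UI would need to do: the midpoint of the two bounds is the temperature, and half their distance is the range.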

To use on Windows, download and run koboldcpp_rocm.exe, which is a one-file pyinstaller build, OR download koboldcpp_rocm_files.zip and run python koboldcpp.py (additional Python pip modules, such as customtkinter and tk or python-tk, might need to be installed).
To use on Linux, clone the repo and build with make LLAMA_HIPBLAS=1 -j4 (-j4 can be adjusted to your number of CPU threads for faster build times)
For a full Linux build, make sure you have the OpenBLAS and CLBlast packages installed:
For Arch Linux: Install cblas, openblas, and clblast.
For Debian: Install libclblast-dev and libopenblas-dev.
Then run make LLAMA_HIPBLAS=1 LLAMA_OPENBLAS=1 LLAMA_CLBLAST=1 -j4

If you're using NVIDIA, you can try koboldcpp.exe at LostRuin's upstream repo.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller, also at LostRuin's repo.

YellowRoseCx/koboldcpp-rocm v1.55.yr0-ROCm KoboldCPP-v1.55.yr0-ROCm on GitHub