p-e-w/heretic v1.3.0 on GitHub

Changes

@Vinay-Umrethe (who had previously contributed under the username @Vinayyyy7) implemented reproducible runs in #191. @p-e-w revised and improved that implementation in #303.
@magiccodingman reduced peak VRAM usage in #239. @olekssy fixed a bug in that implementation in #301.
@farolone added support for Qwen3.5 models in #187.
@MoonRide303 added support for Gemma 4 models in #287.
@erm14254 made sure all abliterable components across layers are displayed in #215.
@cpagac fixed VRAM usage reporting for multi-GPU setups in #169.
@cpagac fixed a division-by-zero error in the evaluator in #225.
@spikymoth improved automatic response prefix determination with a two-step process in #194.
@spikymoth added model card generation for local models with an existing README in #157.
@Diplo2by improved startup speed when Heretic is run with -h/--help in #293.
@AWuhrmann fixed the example value for the max_memory setting in #284.
@p-e-w added an integrated benchmarking system, made the response prefix logic configurable, implemented multiple infrastructure improvements, and fixed various minor issues.

Full Changelog: v1.2.0...v1.3.0