Changes
- @Vinay-Umrethe (who had previously contributed under the username @Vinayyyy7) implemented reproducible runs in #191. @p-e-w revised and improved that implementation in #303.
- @magiccodingman reduced peak VRAM usage in #239. @olekssy fixed a bug in that implementation in #301.
- @farolone added support for Qwen3.5 models in #187
- @MoonRide303 added support for Gemma 4 models in #287
- @erm14254 made sure all abliterable components across layers are displayed in #215
- @cpagac fixed VRAM usage reporting for multi-GPU setups in #169
- @cpagac fixed a division-by-zero error in the evaluator in #225
- @spikymoth improved automatic response prefix determination with a two-step process in #194
- @spikymoth added model card generation for local models with an existing README in #157
- @Diplo2by improved startup speed when Heretic is run with `-h`/`--help` in #293
- @AWuhrmann fixed the example value for the `max_memory` setting in #284
- @p-e-w added an integrated benchmarking system, made the response prefix logic configurable, implemented multiple infrastructure improvements, and fixed various minor issues
New Contributors
- @cpagac made their first contribution in #169
- @farolone made their first contribution in #187
- @erm14254 made their first contribution in #215
- @AWuhrmann made their first contribution in #284
- @MoonRide303 made their first contribution in #287
- @Diplo2by made their first contribution in #293
- @magiccodingman made their first contribution in #239
- @olekssy made their first contribution in #301
Full Changelog: v1.2.0...v1.3.0