mescon/Healarr v1.3.5 on GitHub

The headline: GPU hardware decoding actually works now. v1.3.4 shipped the infrastructure (custom ffmpeg with NVDEC enabled, NVIDIA Container Toolkit passthrough) but missed that -hwaccel auto alone doesn't route AV1 / VP9 / VP8 through the GPU: those codecs' default ffmpeg decoders (libdav1d, libvpx) have no internal hwaccel hooks, so the CUDA context was set up and immediately ignored. This release fixes that with codec-aware decoder selection, switches the base image so the NVIDIA Container Toolkit's library injection actually works, and adds a defensive retry-without-hwaccel safety net so a broken GPU runtime can never trigger mass file deletion. Verified live on an RTX 4070: a scan of an AV1 library ran 30 ffmpeg processes on the GPU at ~224 MiB of GPU memory each, and the captured ffmpeg command line shows -c:v av1_cuvid.

Also fixes a three-bug zombie-scan chain that could resurrect cancelled scans across restarts, and a clipped notification provider dropdown.

Added

AV1, VP9, and VP8 thorough scans now actually engage the GPU on NVIDIA, Intel QSV, and VAAPI hosts (Intel/AMD) (#281). Healarr probes the input codec via a cheap ffprobe and adds a vendor-appropriate -c:v override that engages the actual hardware decoder. Vendor coverage: NVIDIA *_cuvid, Intel QSV *_qsv, and VAAPI (ffmpeg's bare internal decoder names + -hwaccel vaapi hooks). When HEALARR_HEALTH_CHECK_HWACCEL=auto, Healarr detects the vendor from /dev/nvidiactl (NVIDIA) or /dev/dri/renderD* (VAAPI fallback). H.264 / HEVC / MPEG2 / VC-1 don't need explicit overrides because their default decoders already have internal hwaccel hooks. Closes #276.
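The selection logic described above can be sketched roughly as follows. This is an illustrative Python sketch, not Healarr's actual code: the function names, the lookup table, and the choice of -hwaccel value per vendor are assumptions. The decoder names (*_cuvid, *_qsv, and the bare av1 / vp9 / vp8 native decoders) and the ffprobe options are real ffmpeg ones. Note that device-file detection can only distinguish NVIDIA from the VAAPI fallback, so "qsv" has to be requested explicitly in this sketch.

```python
import glob
import os
import subprocess

# Codecs whose default decoders (libdav1d, libvpx) have no hwaccel hooks.
NEEDS_OVERRIDE = {"av1", "vp9", "vp8"}

# Assumed mapping from vendor to ffmpeg's -hwaccel method name.
HWACCEL_NAME = {"nvidia": "cuda", "qsv": "qsv", "vaapi": "vaapi"}


def detect_vendor(exists=os.path.exists, list_glob=glob.glob):
    """Return 'nvidia', 'vaapi', or None, following the 'auto' order above."""
    if exists("/dev/nvidiactl"):
        return "nvidia"
    if list_glob("/dev/dri/renderD*"):
        return "vaapi"
    return None


def probe_codec(path):
    """Ask ffprobe for the codec_name of the first video stream."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=codec_name",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def decoder_args(vendor, codec):
    """Build ffmpeg input options for a vendor and an ffprobe codec name."""
    if vendor is None:
        return []  # no GPU detected: plain software decode
    args = ["-hwaccel", HWACCEL_NAME[vendor]]
    if codec in NEEDS_OVERRIDE:
        if vendor == "nvidia":
            args += ["-c:v", f"{codec}_cuvid"]
        elif vendor == "qsv":
            args += ["-c:v", f"{codec}_qsv"]
        else:
            # VAAPI: force ffmpeg's native decoder, which has hwaccel hooks.
            args += ["-c:v", codec]
    return args
```

These options are input options, so they would go before -i on the ffmpeg command line. For H.264 / HEVC the sketch emits only the -hwaccel flag, matching the note that those codecs need no explicit override.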

Changed

Docker base image switched from Alpine + custom-compiled ffmpeg to Debian-slim + jellyfin-ffmpeg (#278). On Alpine the NVIDIA Container Toolkit was failing to inject libcuda.so / libnvcuvid.so into the container despite runtime: nvidia being set — every cuvid decoder SIGSEGV'd in 1-2 seconds. The same toolkit works perfectly for jellyfin-ffmpeg on Debian (the build Tdarr uses), so we adopted it. ffmpeg version bumps to 8.1.1-Jellyfin; hwaccel list grows from cuda/vaapi/drm to cuda/vaapi/qsv/drm/opencl/vulkan. Image size grows ~371 MB → ~908 MB, the cost of a working out-of-the-box NVIDIA path.
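One way to confirm which hwaccel methods a given image exposes is to parse the output of ffmpeg -hwaccels, a real ffmpeg flag that prints a header line followed by one method name per line. The helper names below are hypothetical, and the ffmpeg binary path is an assumption that may need adjusting inside the container.

```python
import subprocess


def parse_hwaccels(output):
    """Extract method names from the text printed by `ffmpeg -hwaccels`."""
    lines = [line.strip() for line in output.splitlines()]
    # Drop blank lines and the "Hardware acceleration methods:" header.
    return [line for line in lines if line and not line.endswith(":")]


def list_hwaccels(ffmpeg="ffmpeg"):
    """Run ffmpeg and return the hwaccel methods it was built with."""
    result = subprocess.run(
        [ffmpeg, "-hide_banner", "-hwaccels"],
        capture_output=True, text=True, check=True,
    )
    return parse_hwaccels(result.stdout)
```

A method appearing in this list only shows that ffmpeg was built with it. It does not prove the driver libraries were injected into the container, which is the failure this release addresses.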

Fixed

Thorough decode failures that look like a broken GPU runtime now retry the same file with hardware acceleration disabled instead of letting the failure flow into the corruption classifier (#278). The retry catches SIGSEGV, "Failed to setup hwaccel", "Device creation failed", "Cannot load libcuda", and similar decoder-init patterns. Without this guard, a misconfigured driver / missing userspace driver lib / decoder crash would route every affected file straight to the remediation pipeline (delete + re-grab from the *arr) — the silent mass-data-loss failure mode. Healarr now treats GPU-runtime breakage as a transient infrastructure issue, not as evidence of file corruption.
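The shape of that guard can be sketched as below. This is a minimal illustration rather than Healarr's implementation: the function names, the run_decode callback, and the "healthy" / "corrupt" labels are hypothetical, and the pattern list contains only the strings quoted above, not the full set the release matches.

```python
import signal

# Only the patterns named in the release notes; the real list is longer.
GPU_FAILURE_PATTERNS = (
    "Failed to setup hwaccel",
    "Device creation failed",
    "Cannot load libcuda",
)


def looks_like_gpu_runtime_failure(returncode, stderr):
    """Heuristic: does this failure point at the GPU runtime, not the file?"""
    # Python's subprocess reports death by signal as a negative return code;
    # a shell wrapper reports it as 128 + the signal number.
    if returncode in (-signal.SIGSEGV, 128 + signal.SIGSEGV):
        return True
    return any(pattern in stderr for pattern in GPU_FAILURE_PATTERNS)


def check_file(path, run_decode):
    """Decode once with hwaccel; retry in software if the GPU path broke.

    run_decode(path, use_hwaccel) is a hypothetical callback returning
    (returncode, stderr) for one ffmpeg decode pass.
    """
    code, err = run_decode(path, True)
    if code != 0 and looks_like_gpu_runtime_failure(code, err):
        code, err = run_decode(path, False)
    # Only the final result is allowed to reach the corruption classifier.
    return "healthy" if code == 0 else "corrupt"
```

The key design point is ordering: the retry happens before classification, so a file is only ever judged on a decode attempt that did not fail for GPU-runtime reasons.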
Notification provider dropdown no longer gets clipped by its parent card (#277). The dropdown was rendered inline with position: absolute, which respects ancestor overflow: hidden (set on the Settings card for rounded-corner clipping) and Framer Motion's height-animated container. Dropdown now renders via a React portal into document.body with position: fixed coordinates computed from the trigger's bounding rect, so it escapes all overflow contexts.
Cancelling a scan now actually stops the in-process scan loop, and previously-cancelled scans no longer come back as "running" after a restart (#275). Three compounding bugs were producing zombie scans (see #274):
1. CancelScan looked up the in-memory scan map by the DB integer id, but the map is keyed by an internal UUID, so the in-memory ctx.cancel() never fired. The scan loop kept iterating files after the user clicked cancel — only the DB row was updated. CancelScan now also matches by ScanDBID, so the HTTP cancel properly signals the in-process scan; PauseScan and ResumeScan got the same fix.
2. ListInterrupted (the resume-at-startup query) did not filter out rows with completed_at set. A scan that was cancelled and later had its status overwritten to interrupted by a graceful shutdown was being resumed on the next startup, resurrecting the cancelled scan as status='running'. It now filters on completed_at IS NULL.
3. MarkCancelled / MarkOrphansCancelled used WHERE completed_at IS NULL as the guard against clobbering a real terminal state. That guard also blocked recovery of the inconsistent rows the other two bugs created, leaving them permanently stuck. The guard is now status NOT IN ('cancelled', 'completed', 'aborted') — same protection against the cancel-vs-completion race, but also catches the zombie rows.
Any installs that previously hit this state have rows like status IN ('running', 'enumerating', 'scanning', 'interrupted') AND completed_at IS NOT NULL in the scans table; the MarkOrphansCancelled startup hook will now clean them up automatically on the next restart.
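The three fixes can be illustrated with a small in-memory SQLite sketch. Everything here is an assumption made for illustration: the schema is reduced to three columns, the helper names are Python stand-ins for the functions named above, and the sample rows are invented. Only the scans table name, the status values, and the WHERE clauses come from the notes above.

```python
import sqlite3


def make_db():
    """Build a toy scans table containing one row of each interesting kind."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE scans (id INTEGER PRIMARY KEY, status TEXT, completed_at TEXT)"
    )
    db.executemany(
        "INSERT INTO scans (id, status, completed_at) VALUES (?, ?, ?)",
        [
            (1, "interrupted", None),                  # genuinely resumable
            (2, "interrupted", "2000-01-01T00:00:00"),  # zombie: was cancelled
            (3, "running", "2000-01-01T00:00:00"),      # zombie: resurrected
            (4, "completed", "2000-01-01T00:00:00"),    # real terminal state
        ],
    )
    return db


def list_interrupted(db):
    """Bug 2 fix: only resume rows that never reached completion."""
    rows = db.execute(
        "SELECT id FROM scans "
        "WHERE status = 'interrupted' AND completed_at IS NULL ORDER BY id"
    )
    return [row[0] for row in rows]


def mark_cancelled(db, scan_id, new_guard=True):
    """Bug 3 fix: guard on status instead of completed_at. Returns rows changed."""
    if new_guard:
        guard = "status NOT IN ('cancelled', 'completed', 'aborted')"
    else:
        guard = "completed_at IS NULL"  # the old guard, kept for comparison
    cur = db.execute(
        f"UPDATE scans SET status = 'cancelled' WHERE id = ? AND {guard}",
        (scan_id,),
    )
    return cur.rowcount


def find_scan(active, scan_id):
    """Bug 1 fix: match by internal UUID key or by the scan's DB id."""
    if scan_id in active:
        return active[scan_id]
    for scan in active.values():
        if scan["db_id"] == scan_id:
            return scan
    return None
```

With the old guard, mark_cancelled on row 3 changes nothing because completed_at is already set. With the new guard, the zombie row is cleaned up while the completed row 4 is left alone.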

Full Changelog: v1.3.4...v1.3.5