murtaza-nasir/speakr v0.8.11-alpha on GitHub

Video Retention, Parallel Uploads & Speaker Improvements

Big feature release. Video playback, faster uploads, duplicate detection, and a bunch of speaker identification improvements.

New Features

Video Retention - Set VIDEO_RETENTION=true to keep video files for in-browser playback instead of extracting audio and throwing away the video. Great for lectures, presentations, and screen recordings. Audio gets extracted to a temp file for transcription and cleaned up after. Seeking works properly via HTTP Range requests. There's a collapsible toggle in the player controls so you can hide the video when you just want the transcript.
Parallel Uploads - Files now upload concurrently instead of one at a time. Makes batch uploads way faster. Control how many go at once with MAX_CONCURRENT_UPLOADS (default 3).
Duplicate Detection - SHA-256 hashing on upload catches duplicate files. You get a warning toast and a clickable indicator in the sidebar showing existing copies.
Speaker API Endpoints - REST API for speaker identification and assignment with bearer token auth. Extracted the identification logic into a shared service so the API and UI use the same code.
Speaker Identification Improvements - Split button UI for identify/apply, bulk "Apply Suggested" for filling in names without calling the LLM, name sanitization to strip LLM commentary, and AUTO_IDENTIFY_RESPONSE_SCHEMA for local LLM compatibility.
Volume Controls - Volume slider popups and mute indicators on all audio and video players.
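
As a rough illustration of the duplicate detection idea above, here is a minimal sketch of content hashing with SHA-256. The function names and the in-memory index are hypothetical and are not taken from the Speakr codebase; the only thing assumed from the notes is that uploads are compared by their SHA-256 digest.

```python
import hashlib
import io
from typing import BinaryIO, Dict, List


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file-like object in chunks so large uploads never sit fully in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(stream: BinaryIO, known_hashes: Dict[str, List[int]]) -> List[int]:
    """Return IDs of existing recordings with identical content (empty list if none)."""
    return known_hashes.get(sha256_of_stream(stream), [])


# Hypothetical index: content hash -> IDs of recordings already stored.
index = {sha256_of_stream(io.BytesIO(b"lecture audio bytes")): [17]}

duplicate_ids = find_duplicates(io.BytesIO(b"lecture audio bytes"), index)
fresh_ids = find_duplicates(io.BytesIO(b"a different recording"), index)
```

Hashing the bytes rather than comparing filenames means a renamed copy of the same file is still flagged, while two different files that happen to share a name are not.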

Changes

Speaker profiles preserved by default - Deleting all recordings for a speaker no longer auto-deletes their profile. Voice embeddings are aggregated and can't be reconstructed, so we keep them now. Set DELETE_ORPHANED_SPEAKERS=true if you want the old cleanup behavior.

Bug Fixes

ASR_DIARIZE=false was being ignored, so diarization always ran
Bulk delete failed with integrity error on speaker snippets
File monitor stability check capped at 2s regardless of config
clean_llm_response was too aggressive and broke markdown formatting
Null transcription crash on certain processing paths
Safari iOS recording view not updating
Missing translations for folders, API tokens, recording recovery, events, and speakers

Compatibility

Backwards compatible with v0.8.x. Video retention and parallel uploads are opt-in via env vars. The speaker profile preservation change is the only default behavior change. See docs for DELETE_ORPHANED_SPEAKERS if you need the old behavior.
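
To make the defaults concrete, here is a small sketch of how the three environment variables named in these notes could be read. The variable names and the default of 3 come from the notes; the helper names, the set of accepted "true" spellings, and the clamp to a minimum of 1 are assumptions for illustration and may not match how Speakr actually parses its configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def env_flag(env: Mapping[str, str], name: str, default: bool = False) -> bool:
    """Treat "true", "1", "yes", "on" (any case) as enabled; anything else as disabled."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"true", "1", "yes", "on"}


@dataclass(frozen=True)
class ReleaseSettings:
    video_retention: bool          # keep video files for in-browser playback
    max_concurrent_uploads: int    # how many files upload at once
    delete_orphaned_speakers: bool # restore the old profile cleanup behavior


def load_settings(env: Mapping[str, str] = os.environ) -> ReleaseSettings:
    return ReleaseSettings(
        video_retention=env_flag(env, "VIDEO_RETENTION"),
        # Assumption: values below 1 are clamped so uploads can always proceed.
        max_concurrent_uploads=max(1, int(env.get("MAX_CONCURRENT_UPLOADS", "3"))),
        delete_orphaned_speakers=env_flag(env, "DELETE_ORPHANED_SPEAKERS"),
    )


# With nothing set, the defaults described in these notes apply.
defaults = load_settings({})
# Opting in to video retention and the old speaker cleanup behavior.
restored = load_settings({"VIDEO_RETENTION": "true", "DELETE_ORPHANED_SPEAKERS": "true"})
```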

Thanks to ItsMly for the markdown fix (#232)
