Video Retention, Parallel Uploads & Speaker Improvements
Big feature release. Video playback, faster uploads, duplicate detection, and a bunch of speaker identification improvements.
New Features
- Video Retention - Set VIDEO_RETENTION=true to keep video files for in-browser playback instead of extracting audio and throwing away the video. Great for lectures, presentations, and screen recordings. Audio gets extracted to a temp file for transcription and cleaned up after. Seeking works properly via HTTP Range requests. There's a collapsible toggle in the player controls so you can hide the video when you just want the transcript.
- Parallel Uploads - Files now upload concurrently instead of one at a time. Makes batch uploads way faster. Control how many go at once with MAX_CONCURRENT_UPLOADS (default 3).
- Duplicate Detection - SHA-256 hashing on upload catches duplicate files. You get a warning toast and a clickable indicator in the sidebar showing existing copies.
- Speaker API Endpoints - REST API for speaker identification and assignment with bearer token auth. Extracted the identification logic into a shared service so the API and UI use the same code.
- Speaker Identification Improvements - Split button UI for identify/apply, bulk "Apply Suggested" for filling in names without calling the LLM, name sanitization to strip LLM commentary, and AUTO_IDENTIFY_RESPONSE_SCHEMA for local LLM compatibility.
- Volume Controls - Volume slider popups and mute indicators on all audio and video players.
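For anyone curious how hash-based duplicate detection works in general, here is a minimal Python sketch. It is not the project's actual code: the names sha256_of_stream and DuplicateIndex are hypothetical, and the in-memory index stands in for whatever storage the app really uses. It only illustrates the idea of hashing file content with SHA-256 and looking up earlier uploads with the same digest.

```python
import hashlib
import io
from typing import BinaryIO


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file-like object in chunks so large uploads never sit fully in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


class DuplicateIndex:
    """Hypothetical in-memory index mapping content hash -> names of uploads seen so far."""

    def __init__(self) -> None:
        self._by_hash: dict[str, list[str]] = {}

    def register(self, name: str, stream: BinaryIO) -> list[str]:
        """Record an upload and return the names of earlier uploads with identical content."""
        file_hash = sha256_of_stream(stream)
        existing = list(self._by_hash.get(file_hash, []))
        self._by_hash.setdefault(file_hash, []).append(name)
        return existing


if __name__ == "__main__":
    index = DuplicateIndex()
    index.register("lecture.mp4", io.BytesIO(b"same bytes"))
    copies = index.register("lecture-copy.mp4", io.BytesIO(b"same bytes"))
    if copies:
        # In the real UI this is where the warning toast would come in.
        print(f"Duplicate of: {', '.join(copies)}")
```

Hashing the content rather than comparing filenames means a renamed copy still gets flagged, while two different files that share a name do not.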
Changes
- Speaker profiles preserved by default - Deleting all recordings for a speaker no longer auto-deletes their profile. Voice embeddings are aggregated and can't be reconstructed, so we keep them now. Set DELETE_ORPHANED_SPEAKERS=true if you want the old cleanup behavior.
Bug Fixes
- ASR_DIARIZE=false was being ignored, diarization always ran
- Bulk delete failed with integrity error on speaker snippets
- File monitor stability check capped at 2s regardless of config
- clean_llm_response was too aggressive, broke markdown formatting
- Null transcription crash on certain processing paths
- Safari iOS recording view not updating
- Missing translations for folders, API tokens, recording recovery, events, and speakers
Compatibility
Backwards compatible with v0.8.x. Video retention and parallel uploads are opt-in via env vars. The speaker profile preservation change is the only default behavior change. See docs for DELETE_ORPHANED_SPEAKERS if you need the old behavior.
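To make the defaults concrete, here is a rough Python sketch of how the three env vars from this release could be read. It is an illustration only, not the app's config loader: env_flag, ReleaseSettings, and load_settings are made-up names, and the set of accepted "true" spellings is an assumption. The defaults follow the notes above (video retention off, 3 concurrent uploads, orphaned speaker profiles kept).

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumption: which spellings count as "true" is a guess, not documented behavior.
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(env: Mapping[str, str], name: str, default: bool = False) -> bool:
    """Read a boolean env var; an unset variable falls back to the default."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


@dataclass(frozen=True)
class ReleaseSettings:
    """Hypothetical container for the settings mentioned in these notes."""
    video_retention: bool
    max_concurrent_uploads: int
    delete_orphaned_speakers: bool


def load_settings(env: Mapping[str, str] = os.environ) -> ReleaseSettings:
    return ReleaseSettings(
        # Off unless explicitly enabled.
        video_retention=env_flag(env, "VIDEO_RETENTION"),
        # Documented default is 3.
        max_concurrent_uploads=int(env.get("MAX_CONCURRENT_UPLOADS", "3")),
        # Profiles are preserved unless the old cleanup is requested.
        delete_orphaned_speakers=env_flag(env, "DELETE_ORPHANED_SPEAKERS"),
    )


if __name__ == "__main__":
    print(load_settings())
```

With nothing set, this yields the new default behavior; setting DELETE_ORPHANED_SPEAKERS=true is the only change needed to get the pre-release cleanup back.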