This release introduces a modular connector-based architecture for transcription and a full-featured REST API.
Connector Architecture
- Modular Transcription - New connector-based system with auto-detection
- Simplified Configuration - Fewer env vars; auto-detects from `ASR_BASE_URL` or `TRANSCRIPTION_MODEL`
- OpenAI Diarization - Use `gpt-4o-transcribe-diarize` for speaker identification without self-hosting
- Data-Driven UI - Features automatically appear based on connector capabilities
- Connector-Aware Chunking - Chunking handled internally by connectors that support it
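
As a rough illustration of the capability-driven idea, here is a minimal sketch; it is not the project's actual code. The `Capabilities` class, its field names, and the feature labels are all hypothetical and chosen only to show the pattern: a connector declares what it supports, and the UI and chunking logic read those flags instead of hard-coding per-connector behavior.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Capabilities:
    """Hypothetical capability flags a connector might declare."""
    diarization: bool = False        # connector can label speakers
    internal_chunking: bool = False  # connector splits long audio itself


def visible_features(caps: Capabilities) -> List[str]:
    """Return the (hypothetical) UI features to show for these capabilities."""
    features = ["transcript"]  # assumed to be available for every connector
    if caps.diarization:
        features.append("speaker_labels")
    return features


def needs_app_side_chunking(caps: Capabilities) -> bool:
    """Chunk in the application only when the connector does not do it."""
    return not caps.internal_chunking
```

With this shape, adding a new connector means declaring its flags once; no UI code needs to know the connector by name.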
Available Connectors:
| Connector | Use Case |
|---|---|
| `asr_endpoint` | Self-hosted WhisperX/Whisper ASR services |
| `openai_transcribe` | OpenAI gpt-4o-transcribe models (with diarization option) |
| `openai_whisper` | Legacy Whisper API (`whisper-1`) |
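
The auto-detection rules are not spelled out in these notes, so the following is only a guess at how selection from `ASR_BASE_URL` and `TRANSCRIPTION_MODEL` could work. The function name `detect_connector`, the precedence (ASR endpoint first), the `gpt-4o` prefix match, and the fallback to `openai_whisper` are all assumptions; check the Migration Guide for the real behavior.

```python
import os
from typing import Mapping, Optional


def detect_connector(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick a connector name from environment-style settings (illustrative)."""
    env = os.environ if env is None else env

    # Assumption: a configured self-hosted ASR URL takes precedence.
    if env.get("ASR_BASE_URL", "").strip():
        return "asr_endpoint"

    # Assumption: gpt-4o-* model names map to the newer OpenAI connector.
    model = env.get("TRANSCRIPTION_MODEL", "").strip().lower()
    if model.startswith("gpt-4o"):
        return "openai_transcribe"

    # Assumption: everything else falls back to the legacy Whisper API.
    return "openai_whisper"
```

For example, `detect_connector({"TRANSCRIPTION_MODEL": "gpt-4o-transcribe-diarize"})` returns `"openai_transcribe"` under these assumed rules.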
Deprecated Variables:
- `USE_ASR_ENDPOINT=true` → Just set `ASR_BASE_URL` instead
- `WHISPER_MODEL` → Use `TRANSCRIPTION_MODEL` instead
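
If you manage configuration programmatically, the two renames above can be applied mechanically. The helper below is a hypothetical sketch, not something shipped with this release: `migrate_env` and its warning texts are invented for illustration, and it assumes that an explicitly set `TRANSCRIPTION_MODEL` should win over a leftover `WHISPER_MODEL`.

```python
from typing import Dict, List, Tuple


def migrate_env(env: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Return a copy of env with deprecated variables replaced, plus warnings."""
    new_env = dict(env)  # never mutate the caller's mapping
    warnings: List[str] = []

    if "WHISPER_MODEL" in new_env:
        old_model = new_env.pop("WHISPER_MODEL")
        # Assumption: keep TRANSCRIPTION_MODEL if the user already set it.
        new_env.setdefault("TRANSCRIPTION_MODEL", old_model)
        warnings.append("WHISPER_MODEL is deprecated; use TRANSCRIPTION_MODEL")

    if "USE_ASR_ENDPOINT" in new_env:
        new_env.pop("USE_ASR_ENDPOINT")
        if new_env.get("ASR_BASE_URL"):
            warnings.append("USE_ASR_ENDPOINT is deprecated; ASR_BASE_URL alone is enough")
        else:
            warnings.append("USE_ASR_ENDPOINT is deprecated; set ASR_BASE_URL instead")

    return new_env, warnings
```

Running it over an old configuration gives you both the cleaned-up variables and a list of messages you can surface to whoever maintains the deployment.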
REST API v1
- Complete API - Full CRUD for recordings, tags, speakers, processing
- Swagger UI - Interactive docs at `/api/v1/docs`
- Stats Endpoint - Dashboard-compatible for gethomepage.dev
- Batch Operations - Bulk update, delete, transcribe
- Chat & Events API - Programmatic AI chat and calendar event access
- Audio Download - Stream or download audio files
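
For scripted access, a small helper for building requests under the `/api/v1` prefix can be handy. This is a sketch under stated assumptions: only the `/api/v1/docs` path is taken from these notes, while the host and port in the usage note, the `api_request` helper, and the Bearer-token header are hypothetical. Consult the Swagger UI for the actual endpoints and authentication scheme.

```python
from typing import Optional
from urllib.request import Request

API_PREFIX = "/api/v1"


def api_request(base_url: str, path: str, token: Optional[str] = None) -> Request:
    """Build a GET request for a path under the v1 API prefix."""
    # Normalize slashes so "host/" + "/docs" does not produce "//".
    url = base_url.rstrip("/") + API_PREFIX + "/" + path.lstrip("/")
    request = Request(url, method="GET")
    if token:
        # Assumption: the API accepts a Bearer token; verify in the API docs.
        request.add_header("Authorization", f"Bearer {token}")
    return request
```

The returned object can be passed to `urllib.request.urlopen`; for instance, `api_request("http://localhost:8899", "docs")` targets the Swagger UI on a hypothetical local instance.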
Documentation
- Migration Guide - Update your configuration
- API Reference - Complete endpoint documentation
- Updated env examples in `config/env.transcription.example`
Compatibility
Fully backwards compatible. Existing configurations continue to work with deprecation warnings in logs.