github murtaza-nasir/speakr v0.5.0-alpha
v0.5.0 - Tagging system introduced

Release Notes - v0.5.0

New Features

Advanced Tagging System

  • Multi-tag support: Assign multiple tags to recordings with priority ordering
  • Custom prompts per tag: Each tag can have its own summarization prompt that automatically applies
  • ASR defaults per tag: Configure default language and speaker settings for each tag
  • Tag-based search: Filter recordings by tags using the search bar (e.g., tag:meeting)
  • Visual tag management: Beautiful tag display with numbered priorities and easy removal
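
To illustrate the tag:meeting search syntax, here is a minimal sketch that splits a query into tag filters and free text. The helper name parse_search_query and its matching rules (case-insensitive tag names, whitespace-separated tokens) are assumptions for illustration, not Speakr's actual implementation.

```python
from __future__ import annotations


def parse_search_query(query: str) -> tuple[list[str], str]:
    """Split a search string into tag filters and remaining free text.

    Hypothetical helper: tokens of the form `tag:<name>` become tag
    filters; every other token is kept as plain search text.
    """
    tags: list[str] = []
    words: list[str] = []
    for token in query.split():
        prefix, sep, name = token.partition(":")
        if sep and prefix.lower() == "tag" and name:
            tags.append(name.lower())  # assumed: tag names match case-insensitively
        else:
            words.append(token)
    return tags, " ".join(words)
```

With this sketch, a query such as "tag:meeting budget review" would filter by the tag "meeting" and search the remaining text "budget review".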

Enhanced ASR (Automatic Speech Recognition) Integration

  • Advanced options UI: Configure language and speaker detection directly from the upload screen
  • Automatic speaker detection: No more hardcoded speaker limits - let ASR auto-detect participants
  • Improved diarization: Better speaker identification with automatic participant extraction
  • Simplified configuration: Just set USE_ASR_ENDPOINT=true and the system handles the rest

Document Export

  • Word document generation: Export summaries and notes as properly formatted .docx files
  • Server-side generation: Reliable document creation using python-docx
  • Rich formatting: Headers, bold text, lists, and metadata preserved in exports

Improvements

User Interface

  • Improved styling: Better theme consistency for chat bubbles and recording cards
  • Reorganized layouts: Recording cards now show title → date → participants → tags
  • Icon enhancements: Added visual icons for dates and participants
  • Compact design: More efficient use of screen space with appropriate spacing
  • Upload progress management: Clear completed uploads with one click

Audio Processing

  • Configurable chunk sizes: Respects CHUNK_SIZE_MB environment variable properly
  • Better chunk calculation: Improved algorithm reaches target sizes more accurately (95% accuracy)
  • AAC file handling: Special handling for problematic AAC-encoded files
  • MIME type detection: Enhanced support for M4A, AAC, and other audio formats
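
As a rough sketch of how a configurable chunk size could drive splitting, the code below reads CHUNK_SIZE_MB and plans evenly sized chunks from a known duration and bitrate. The function names, the bitrate-based size estimate, and the fallback behaviour are assumptions; the release notes do not describe the actual algorithm.

```python
from __future__ import annotations

import math
import os


def chunk_size_bytes(default_mb: float = 25.0) -> int:
    """Read CHUNK_SIZE_MB from the environment, falling back to the default."""
    raw = os.environ.get("CHUNK_SIZE_MB", "")
    try:
        size_mb = float(raw)
    except ValueError:
        size_mb = default_mb
    # Assumed policy: reject non-finite or non-positive values.
    if not math.isfinite(size_mb) or size_mb <= 0:
        size_mb = default_mb
    return int(size_mb * 1024 * 1024)


def plan_chunks(
    duration_s: float, bitrate_bps: int, target_bytes: int
) -> list[tuple[float, float]]:
    """Return (start, end) offsets in seconds, each chunk at most ~target_bytes.

    Hypothetical approach: estimate seconds per chunk from the bitrate,
    then spread the duration evenly over the required number of chunks.
    """
    if duration_s <= 0 or bitrate_bps <= 0:
        return []
    seconds_per_chunk = target_bytes * 8 / bitrate_bps
    count = max(1, math.ceil(duration_s / seconds_per_chunk))
    step = duration_s / count
    return [(i * step, min(duration_s, (i + 1) * step)) for i in range(count)]
```

Spreading the duration evenly avoids a very short final chunk, at the cost of each chunk being somewhat smaller than the target.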

Summary Generation

  • Better markdown formatting: Improved prompts for consistent bold key-value pairs
  • Multiple prompt sources: Combines tag prompts with user prompts in priority order
  • Debugging support: Added logging for troubleshooting empty summaries
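
The prompt combination described above might look like the following minimal sketch. The Tag class, the build_summary_prompt function, the default prompt text, and the rule that tag prompts come before the user prompt are all assumptions; the notes only state that prompts are combined in priority order.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Tag:
    """Hypothetical tag record; a lower priority number sorts first."""

    name: str
    priority: int
    prompt: Optional[str] = None


def build_summary_prompt(
    tags: list[Tag],
    user_prompt: Optional[str] = None,
    default_prompt: str = "Summarize the recording.",
) -> str:
    """Join tag prompts (by priority) and the user prompt into one prompt."""
    parts = [
        tag.prompt.strip()
        for tag in sorted(tags, key=lambda t: t.priority)
        if tag.prompt and tag.prompt.strip()
    ]
    # Assumed ordering: the user's own prompt is appended after tag prompts.
    if user_prompt and user_prompt.strip():
        parts.append(user_prompt.strip())
    return "\n\n".join(parts) if parts else default_prompt
```

Tags without a prompt are skipped, so a recording tagged only with prompt-less tags falls back to the user prompt or the default.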

Bug Fixes

  • Fixed NameError when using local LLM setups without API keys
  • Fixed UnboundLocalError when uploading files without tags
  • Fixed ASR parameter hierarchy (user input > tag defaults > env vars > auto-detect)
  • Fixed database migration issues with 'order' column quoting
  • Fixed 'No file provided' error on page refresh
  • Fixed duplicate tag display in management interfaces
  • Fixed full-screen drop zone conflicts causing double uploads
  • Fixed code block styling for proper dark/light mode contrast
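
The ASR parameter hierarchy listed above (user input > tag defaults > env vars > auto-detect) can be sketched as a first-non-empty lookup. The function name resolve_asr_option and the environment variable name used in the usage note are hypothetical; returning None stands in for "let the ASR service auto-detect".

```python
from __future__ import annotations

import os
from typing import Optional


def resolve_asr_option(
    user_value: Optional[str],
    tag_values: list[Optional[str]],
    env_name: str,
) -> Optional[str]:
    """Pick the first non-empty value in precedence order.

    Order: user input, then tag defaults (already sorted by tag priority),
    then the environment variable. None means auto-detect.
    """
    candidates = [user_value, *tag_values, os.environ.get(env_name)]
    for value in candidates:
        if value is not None and value.strip():
            return value.strip()
    return None
```

For example, resolve_asr_option(None, ["fr"], "DEMO_ASR_LANGUAGE") returns "fr" regardless of the environment, because a tag default outranks an environment variable.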

Configuration

New Environment Variables

  • LOG_LEVEL: Control logging verbosity (ERROR/INFO/DEBUG)
  • CHUNK_SIZE_MB: Configure audio chunk size for processing (default: 25MB)
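
As a small illustration of LOG_LEVEL, the sketch below maps the variable onto Python's standard logging levels. The function name configure_logging, the INFO fallback, and the case-insensitive handling are assumptions; the notes only list ERROR, INFO, and DEBUG as accepted values.

```python
import logging
import os

ALLOWED_LEVELS = {"ERROR", "INFO", "DEBUG"}


def configure_logging(default: str = "INFO") -> int:
    """Set the root logger level from LOG_LEVEL and return the numeric level."""
    name = os.environ.get("LOG_LEVEL", default).strip().upper()
    if name not in ALLOWED_LEVELS:
        name = default  # assumed fallback for unknown or empty values
    level = getattr(logging, name)
    logging.getLogger().setLevel(level)
    return level
```

Setting LOG_LEVEL=DEBUG before startup would then surface the extra summary-generation logging mentioned in these notes.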

Simplified ASR Setup

  • Automatic diarization when ASR endpoint is enabled
  • Removed redundant configuration options
  • Cleaner environment file examples

Documentation

  • Improved environment file examples with clear comments
  • Added debugging guidance for summary generation issues

Breaking Changes

  • None - all changes are backward compatible

Migration Notes

  • Database will automatically migrate to support new tagging system
  • Existing recordings will work without tags
  • ASR users should review simplified configuration in env.asr.example

Coming Next

  • Bulk tag operations
  • Tag templates for common workflows
