murtaza-nasir/speakr v0.5.7-alpha
v0.5.7 - GPT-5 Support, Custom Summarization Prompts & More


Release Notes - v0.5.7

Major Features

GPT-5 Model Support

  • Automatic detection: Speakr automatically recognizes GPT-5 models and adjusts API parameters accordingly
  • Advanced parameters: Support for reasoning_effort (minimal/low/medium/high) and verbosity (low/medium/high)
  • Model variants: Full support for gpt-5, gpt-5-mini, gpt-5-nano, and gpt-5-chat-latest
  • Flexible configuration: Environment variables for customizing reasoning depth and output detail
  • Seamless migration: Existing GPT-4 configurations continue working without changes
  • Provider detection: Automatically uses appropriate parameters based on API endpoint
  • Updated SDK: OpenAI Python library upgraded to v2.2.0 for GPT-5 compatibility
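The automatic detection and parameter adjustment described above can be sketched roughly as follows. The function name `build_completion_params`, the prefix check, and the `temperature` fallback are illustrative assumptions, not Speakr's actual implementation:

```python
# Sketch of GPT-5-aware parameter selection (illustrative, not Speakr's actual code).
# GPT-5 models take reasoning_effort/verbosity, while standard chat models
# keep their existing sampling parameters.

GPT5_PREFIXES = ("gpt-5",)  # matches gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-chat-latest

def build_completion_params(model: str,
                            reasoning_effort: str = "medium",
                            verbosity: str = "medium") -> dict:
    """Return kwargs for a chat-completion call based on the model family."""
    params = {"model": model}
    if model.startswith(GPT5_PREFIXES):
        # GPT-5-specific knobs from this release
        params["reasoning_effort"] = reasoning_effort  # minimal/low/medium/high
        params["verbosity"] = verbosity                # low/medium/high
    else:
        # Standard models keep their usual parameters
        params["temperature"] = 0.7
    return params
```

In a scheme like this, a GPT-4 model continues to receive only its standard parameters, which is why existing configurations keep working without changes.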

Custom Prompt Selection in Summary Reprocessing

  • Three prompt options: Choose between default, tag-based prompts, or custom one-time prompts
  • Tag library access: Use prompts from any saved tag without modifying the recording's tags
  • Experimentation workflow: Test different summarization approaches without permanent changes
  • Freeform custom prompts: Enter specific instructions for unique summary perspectives
  • Non-destructive: Only regenerates summary, preserving original transcription

Improvements

Progressive Web App (PWA)

  • Service worker: Offline support with asset caching for faster load times
  • Wake Lock API: Prevents the screen from sleeping during recording sessions (desktop and mobile)
  • Mobile notifications: Shows recording status in notification tray (mobile only)
  • Mobile warnings: Clear UI warnings about keeping app visible during mobile recording

User Experience

  • Navigation warnings: Prevent accidental data loss when leaving pages with unsaved recordings
  • Mobile recording UX: Warning banner displayed during recording on mobile devices

Configuration

  • Example files updated: Both env.whisper.example and env.asr.example include GPT-5 settings
  • Clear parameter documentation: Inline comments explain each GPT-5 option
  • Sensible defaults: Medium reasoning and verbosity provide balanced performance

Technical Improvements

API Layer

  • Intelligent parameter handling: Conditional logic for GPT-5 vs standard model parameters
  • Provider detection: Automatic identification of OpenAI vs OpenRouter endpoints
  • Parameter validation: Proper error handling for unsupported parameter combinations
  • Logging enhancements: Clear visibility into which parameters are being applied

Backend Processing

  • Custom prompt override: New parameter in summary generation allows temporary prompt replacement
  • Prompt hierarchy preservation: Custom prompts correctly override tag/user/admin defaults

Frontend Integration

  • Modal enhancements: Clean UI for selecting reprocessing options
  • Tag filtering: Only show tags with custom prompts in selection dropdown
  • Prompt preview: Display tag prompt excerpts for informed selection
  • Radio button grouping: Clear visual organization of prompt source options

Bug Fixes

  • Fixed i18n translations for ASR speaker labels not appearing correctly in recording view
  • Fixed navigation warnings not appearing when leaving pages with active recordings

Configuration Changes

New Environment Variables

# GPT-5 specific parameters (optional)
GPT5_REASONING_EFFORT=medium  # minimal, low, medium, high
GPT5_VERBOSITY=medium         # low, medium, high
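A minimal sketch of reading these variables with the documented defaults (the helper name `load_gpt5_settings` is illustrative, not part of Speakr's codebase):

```python
import os

# Read GPT-5 tuning parameters, falling back to the documented "medium" defaults.
def load_gpt5_settings(env=os.environ):
    return {
        "reasoning_effort": env.get("GPT5_REASONING_EFFORT", "medium"),
        "verbosity": env.get("GPT5_VERBOSITY", "medium"),
    }
```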

Updated Dependencies

  • openai>=2.2.0 (upgraded from 1.3.0)

API Changes

Enhanced Functions

  • call_llm_completion(): Now detects GPT-5 models and adjusts parameters
  • generate_summary_only_task(): Accepts optional custom_prompt_override parameter
  • reprocess_summary(): Passes custom prompt from frontend to backend

New Helper Functions

  • is_gpt5_model(): Detects GPT-5 model names
  • is_using_openai_api(): Identifies OpenAI API endpoints
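Plausible sketches of these two helpers, assuming simple string checks on the model name and base URL; Speakr's actual logic may differ:

```python
# Hypothetical implementations of the two helpers; actual Speakr logic may differ.

def is_gpt5_model(model_name: str) -> bool:
    """True for gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-chat-latest, etc."""
    return model_name.lower().startswith("gpt-5")

def is_using_openai_api(base_url: str) -> bool:
    """True when the configured endpoint points at api.openai.com."""
    return "api.openai.com" in base_url
```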

Migration Notes

Upgrading to v0.5.7

If using pre-built images, just pull the latest image. If building locally, do the following:

For GPT-5 Users:

  1. Update dependencies: pip install -r requirements.txt
  2. Set TEXT_MODEL_BASE_URL to https://api.openai.com/v1
  3. Set TEXT_MODEL_NAME to gpt-5, gpt-5-mini, or gpt-5-nano
  4. Optionally configure GPT5_REASONING_EFFORT and GPT5_VERBOSITY
  5. Restart Speakr and check logs for confirmation

For Existing Installations:

  • No configuration changes required
  • Existing models (GPT-4, Claude, etc.) continue working unchanged

Breaking Changes

None - this release is fully backward compatible.

Documentation Updates

  • New: docs/user-guide/pwa.md - Comprehensive PWA guide with mobile recording limitations
  • New: docs/admin-guide/model-configuration.md - Comprehensive model setup guide
  • Enhanced: docs/user-guide/transcripts.md - Added custom prompt reprocessing section
  • Enhanced: docs/user-guide/recording.md - Mobile recording best practices and limitations
  • Updated: docs/getting-started.md - Mentions GPT-5 configuration
  • Updated: docs/features.md - References advanced model support
  • Updated: docs/index.md - Latest updates section reflects v0.5.7
  • Updated: docs/user-guide/index.md - Added PWA documentation card
  • Updated: docs/admin-guide/index.md - Added model configuration card
  • Updated: config/env.whisper.example - GPT-5 parameters and documentation
  • Updated: config/env.asr.example - GPT-5 parameters and documentation
  • Updated: README.md - Version and PWA feature highlights
