jamiepine/voicebox v0.3.0 on GitHub

This release rewrites the backend into a modular architecture, overhauls the settings UI into routed sub-pages, fixes audio player freezing, migrates documentation to Fumadocs, and ships a batch of bug fixes targeting the most-reported issues from the tracker.

The backend's 3,000-line monolithic main.py has been decomposed into domain routers, a services layer, and a proper database package. A style guide and ruff configuration now enforce consistency. On the frontend, settings have been split into dedicated routed pages with server logs, a changelog viewer, and an about page. The audio player no longer freezes mid-playback, and model loading status is now visible in the UI. Seven user-reported issues have been addressed, including server crashes during sample uploads, a stale generation list, cryptic error messages, and missing CUDA support for RTX 50-series GPUs.

Settings Overhaul (#294)

Split settings into routed sub-tabs: General, Generation, GPU, Logs, Changelog, About
Added live server log viewer with auto-scroll
Added in-app changelog page that parses CHANGELOG.md at build time
Added About page with version info, license, and generation folder quick-open
Extracted reusable SettingRow component for consistent setting layouts

Audio Player Fix (#293)

Fixed audio player freezing during playback
Improved playback UX with better state management and listener cleanup
Fixed restart race condition during regeneration
Added stable keys for audio element re-rendering
Improved accessibility across player controls

Backend Refactor (#285)

Extracted all routes from main.py into 13 domain routers under backend/routes/ — main.py dropped from ~3,100 lines to ~10
Moved CRUD and service modules into backend/services/, platform detection into backend/utils/
Split monolithic database.py into a database/ package with separate models, session, migrations, and seed modules
Added backend/STYLE_GUIDE.md and pyproject.toml with ruff linting config
Removed dead code: unused _get_cuda_dll_excludes, stale studio.py, example_usage.py, old Makefile
Deduplicated shared logic across TTS backends into backends/base.py
Improved startup logging with version, platform, data directory, and database stats
Fixed startup database session leak: sessions now roll back and close in a finally block
Isolated shutdown unload calls so one backend failure doesn't block the others
Handled null duration in story_items migration
Rejected model migration when the target is a subdirectory of the source cache
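
Three of the fixes above follow well-known patterns, sketched here in plain Python. This is an illustration under assumptions, not the project's code: count_rows, unload_all, and is_inside are hypothetical names, the session is assumed to expose rollback() and close() (as a SQLAlchemy Session does), and each backend is assumed to expose an unload() method.

```python
import logging
from pathlib import Path

logger = logging.getLogger("voicebox.sketch")  # hypothetical logger name


def count_rows(session_factory, query):
    """Run one startup query and always release the session."""
    session = session_factory()
    try:
        return query(session)
    finally:
        # Runs on success and on error, so the session never leaks.
        session.rollback()
        session.close()


def unload_all(backends):
    """Unload every backend; return the names of those that failed."""
    failed = []
    for name, backend in backends.items():
        try:
            backend.unload()
        except Exception:
            # Log and continue so one failure does not block the rest.
            logger.exception("unload failed for %s", name)
            failed.append(name)
    return failed


def is_inside(target, source):
    """True if target is source itself or a directory nested under it."""
    target_path = Path(target).resolve()
    source_path = Path(source).resolve()
    return target_path == source_path or source_path in target_path.parents
```

A migration routine could call is_inside(target, source) first and refuse to proceed when it returns True. Treating an identical path as "inside" is a choice made for this sketch; the notes only mention the subdirectory case.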

Documentation Rewrite (#288)

Migrated docs site from Mintlify to Fumadocs (Next.js-based)
Rewrote introduction and root page with content from README
Added "Edit on GitHub" links and last-updated timestamps on all pages
Generated OpenAPI spec and auto-generated API reference pages
Removed stale planning docs (CUDA_BACKEND_SWAP, EXTERNAL_PROVIDERS, MLX_AUDIO, TTS_PROVIDER_ARCHITECTURE, etc.)
Sidebar groups now expand by default; root redirects to /docs
Added OG image metadata and /og preview page

UI & Frontend

Added model loading status indicator and effects preset dropdown (3187344)
Fixed take-label race condition during regeneration
Added accessible focus styling to select component
Softened select focus indicator opacity
Addressed 4 critical and 12 major issues from CodeRabbit review

Bug Fixes (#295)

Fixed sample uploads crashing the server — audio decoding now runs in a thread pool instead of blocking the async event loop (#278)
Fixed generation list not updating when a generation completes: switched to refetchQueries for reliable cache busting, added an SSE error fallback, and added a page reset on completion (#231)
Fixed error toasts showing [object Object] instead of the actual error message (#290)
Added Whisper model selection (base, small, medium, large, turbo) and expanded language support to the /transcribe endpoint (#233)
Upgraded CUDA backend build from cu121 to cu126 for RTX 50-series (Blackwell) GPU support (#289)
Handled client disconnects in SSE and streaming endpoints to suppress [Errno 32] Broken pipe errors (#248)
Fixed Docker build failure from pip hash mismatch on Qwen3-TTS dependencies (#286)
Added 50 MB upload size limit with chunked reads to prevent unbounded memory allocation on sample uploads
Eliminated redundant double audio decode in sample processing pipeline
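
The upload fixes above combine two ideas: read the body in bounded chunks, and keep blocking decode work off the event loop. A minimal sketch follows, assuming Python 3.9+ for asyncio.to_thread and a stream object with an async read(size) method. The names read_limited, decode_duration, and handle_upload are hypothetical, the 1 MB chunk size is an assumed value, and the WAV duration function is only a stand-in for whatever decoding the backend actually performs.

```python
import asyncio
import io
import wave

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # 50 MB, the limit named in the notes
CHUNK_SIZE = 1024 * 1024  # 1 MB per read; an assumed value


class UploadTooLarge(Exception):
    """Raised when an upload exceeds the configured limit."""


async def read_limited(stream, limit=MAX_UPLOAD_BYTES, chunk_size=CHUNK_SIZE):
    """Read an async stream in chunks, failing once the limit is passed."""
    buffer = bytearray()
    while True:
        chunk = await stream.read(chunk_size)
        if not chunk:
            break
        buffer.extend(chunk)
        if len(buffer) > limit:
            # Stop early: memory use never exceeds limit + one chunk.
            raise UploadTooLarge(f"upload exceeds {limit} bytes")
    return bytes(buffer)


def decode_duration(data):
    """Blocking stand-in for audio decoding: WAV duration in seconds."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


async def handle_upload(stream):
    """Read the upload with a size cap, then decode off the event loop."""
    data = await read_limited(stream)
    # Run the blocking decode in a worker thread so the event loop stays free.
    return await asyncio.to_thread(decode_duration, data)
```

In a web handler, UploadTooLarge would typically be mapped to an HTTP 413 response. Because the decoded result is returned from the worker thread, the same bytes do not need to be decoded a second time.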

Platform Fixes

Replaced netstat with TcpStream + PowerShell for Windows port detection (#277)
Fixed Docker frontend build and cleaned up Docker docs
Fixed macOS download links to use .dmg instead of .app.tar.gz
Added dynamic download redirect routes to landing site
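
The port detection change replaces parsing netstat output with a direct TCP probe. The notes name TcpStream, which suggests the project does this in Rust; the sketch below shows the same connect-and-see idea in Python purely as an illustration. The function name port_in_use, the default host, and the timeout are assumptions, and the PowerShell part of the fix is not covered.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat the port as free.
        return False
```

A connect probe only reports whether a listener answers; it does not identify which process owns the port.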

Release Tooling

Added draft-release-notes and release-bump agent skills
Wired CI release workflow to extract notes from CHANGELOG.md for GitHub Releases
Backfilled changelog with all historical releases
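
Extracting one release's notes from CHANGELOG.md can be sketched as a small text scan. This assumes version headings of the form "## [0.3.0]" or "## 0.3.0" (optionally with a "v" prefix or trailing text such as a date); the actual heading style in the repository and the function name extract_notes are assumptions, not taken from the workflow.

```python
import re


def extract_notes(changelog, version):
    """Return the body of one version's section, or None if absent."""
    heading = re.compile(r"^## \[?v?" + re.escape(version) + r"\]?(?:\s|$)")
    lines = changelog.splitlines()
    start = None
    for i, line in enumerate(lines):
        if start is None:
            if heading.match(line):
                start = i + 1  # body begins on the line after the heading
        elif line.startswith("## "):
            # The next version heading ends the section.
            return "\n".join(lines[start:i]).strip()
    if start is None:
        return None
    return "\n".join(lines[start:]).strip()
```

A release job could read CHANGELOG.md, call extract_notes with the tag's version, and fail the build when the result is None so that a release never ships with empty notes.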