Swama v1.3.0 Release Notes
What's New
OpenAI-Compatible Audio API
- /v1/audio/transcriptions endpoint - Full OpenAI API compatibility for seamless integration
- Multipart form data support - Proper file upload handling with audio format validation
- Multiple response formats - JSON, text, and verbose JSON output options
- Robust error handling - Comprehensive error responses with proper HTTP status codes
Enhanced CLI with Audio Commands
- New transcribe command - Comprehensive audio transcription with customizable options
- Enhanced pull command - Unified downloading for both MLX and WhisperKit models
- Rich CLI options - Support for model selection, language, temperature, prompts, and output formats
- Intelligent model validation - Automatic detection and validation of WhisperKit models
Usage
Audio Transcription
# Basic transcription
swama transcribe audio.wav
# Specify model and language
swama transcribe audio.wav -m whisper-base -l en
# Get detailed output with timestamps
swama transcribe audio.wav --verbose
# JSON output for programmatic use
swama transcribe audio.wav -f json
# Fine-tune with temperature and prompt
swama transcribe audio.wav -t 0.2 -p "Technical discussion about AI"
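For several recordings, the commands above can be wrapped in a loop. The following is a minimal sketch, not part of Swama: the function name transcribe_all, the DRY_RUN variable, and the output file naming are illustrative. It uses only the -m and -f json flags shown above, and it assumes that swama transcribe prints its result to standard output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: transcribe every file passed after the model name.
# Assumption: `swama transcribe` writes its result to standard output.
# Set DRY_RUN=1 to print the commands instead of running them.
transcribe_all() {
  local model="$1"; shift
  local file out
  for file in "$@"; do
    out="${file%.*}.json"          # meeting.wav -> meeting.json
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "swama transcribe $file -m $model -f json > $out"
    else
      swama transcribe "$file" -m "$model" -f json > "$out"
    fi
  done
}
```

For example, running transcribe_all whisper-base *.wav with DRY_RUN=1 set first shows which commands would run before any transcription starts.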
WhisperKit Model Management
# Download WhisperKit models
swama pull whisper-tiny
swama pull whisper-base
swama pull whisper-small
swama pull whisper-large
# List all available models (includes WhisperKit)
swama list
Audio API Integration
# Transcribe via HTTP API (OpenAI compatible)
curl -X POST http://localhost:28100/v1/audio/transcriptions \
-F "file=@meeting.wav" \
-F "model=whisper-large" \
-F "language=en" \
-F "response_format=verbose_json"
# Simple text response
curl -X POST http://localhost:28100/v1/audio/transcriptions \
-F "file=@audio.wav" \
-F "model=whisper-large" \
-F "response_format=text"
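The two curl calls above differ only in file, model, and response format, so they can be folded into one small function. This is a sketch under stated assumptions: transcribe_api and SWAMA_URL are hypothetical names, and the default port is simply the one used in the examples above. The --fail flag makes curl exit with a non-zero status when the server answers with an HTTP error code.

```shell
# Hypothetical wrapper around the /v1/audio/transcriptions endpoint.
# SWAMA_URL is an assumed variable name; the default matches the examples above.
SWAMA_URL="${SWAMA_URL:-http://localhost:28100}"

transcribe_api() {
  local file="$1" model="${2:-whisper-large}" format="${3:-json}"
  # Check locally first so a wrong path fails before any request is sent.
  if [ ! -f "$file" ]; then
    echo "error: no such file: $file" >&2
    return 1
  fi
  curl -sS --fail -X POST "$SWAMA_URL/v1/audio/transcriptions" \
    -F "file=@$file" \
    -F "model=$model" \
    -F "response_format=$format"
}
```

A call such as transcribe_api meeting.wav whisper-large verbose_json then mirrors the first request above, and transcribe_api audio.wav whisper-large text mirrors the second.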
Download
Available formats:
- DMG installer - Easy drag-and-drop installation for macOS
- ZIP archive - Direct application bundle
Upgrade Notes
- If upgrading from a previous version: After installing the new version, open Swama from the menu bar and click "Install Command Line Tool…" to update the CLI tools
- New audio capabilities: The transcribe command and audio API are immediately available after upgrade
- Model storage: WhisperKit models are stored in ~/.swama/models/whisperkit to keep them organized separately from MLX models
- API compatibility: Existing /v1/chat/completions and /v1/embeddings endpoints continue to work unchanged
Requirements
- macOS 14.0+
- Apple Silicon (M1/M2/M3/M4)
- For audio transcription: Compatible audio formats (WAV recommended, other formats auto-converted)
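The first two requirements can be checked from a terminal before installing. The function below is an illustrative sketch, not something shipped with Swama: the name meets_requirements is hypothetical, and it relies on the standard macOS commands sw_vers -productVersion and uname -m.

```shell
# Hypothetical check for "macOS 14.0+" and "Apple Silicon".
# Pass a version and architecture explicitly, or omit them to
# query the current machine.
meets_requirements() {
  local version="${1:-$(sw_vers -productVersion)}"
  local arch="${2:-$(uname -m)}"
  local major="${version%%.*}"     # "14.5" -> "14"
  [ "$major" -ge 14 ] && [ "$arch" = "arm64" ]
}
```

Calling meets_requirements with no arguments checks the current machine and succeeds only when both conditions hold. Note that uname -m reports x86_64 in a shell running under Rosetta, so the check should be run from a native terminal session.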
Key Benefits
- Privacy-focused: All audio processing happens locally on your device
- Cost-effective: No API calls to external services for transcription
- High performance: Optimized for Apple Silicon with intelligent memory management
- Developer-friendly: OpenAI-compatible API for easy integration into existing workflows
What's Changed
- Support Audio by @sxy-trans-n in #31
Full Changelog: v1.2.0...v1.3.0