Swama v1.3.0 Release Notes

🆕 What's New

OpenAI-Compatible Audio API

/v1/audio/transcriptions endpoint - Full OpenAI API compatibility for seamless integration
Multipart form data support - Proper file upload handling with audio format validation
Multiple response formats - JSON, text, and verbose JSON output options
Robust error handling - Comprehensive error responses with proper HTTP status codes

Enhanced CLI with Audio Commands

New transcribe command - Comprehensive audio transcription with customizable options
Enhanced pull command - Unified downloading for both MLX and WhisperKit models
Rich CLI options - Support for model selection, language, temperature, prompts, and output formats
Intelligent model validation - Automatic detection and validation of WhisperKit models

🚀 Usage

Audio Transcription

# Basic transcription
swama transcribe audio.wav

# Specify model and language
swama transcribe audio.wav -m whisper-base -l en

# Get detailed output with timestamps
swama transcribe audio.wav --verbose

# JSON output for programmatic use
swama transcribe audio.wav -f json

# Fine-tune with temperature and prompt
swama transcribe audio.wav -t 0.2 -p "Technical discussion about AI"
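For a folder of recordings, the commands above can be wrapped in a small loop. The following is a minimal sketch, not part of Swama itself: the function name transcribe_dir and the SWAMA_BIN override are hypothetical, and it assumes that `swama transcribe ... -f json` writes the transcript to standard output.

```shell
# Transcribe every .wav file in a directory, writing one .json file per input.
# SWAMA_BIN is a hypothetical override (defaults to `swama`) so the loop can be dry-run.
# Assumption: `swama transcribe <file> -f json` prints the result to stdout.
transcribe_dir() {
  dir="$1"
  bin="${SWAMA_BIN:-swama}"
  for f in "$dir"/*.wav; do
    [ -e "$f" ] || continue            # skip when the glob matches nothing
    "$bin" transcribe "$f" -f json > "${f%.wav}.json" || {
      echo "transcription failed: $f" >&2
      return 1
    }
  done
}
```

Calling `transcribe_dir ./recordings` would then leave `meeting.json` next to `meeting.wav`, and so on for each file.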

WhisperKit Model Management

# Download WhisperKit models
swama pull whisper-tiny
swama pull whisper-base
swama pull whisper-small
swama pull whisper-large

# List all available models (includes WhisperKit)
swama list

Audio API Integration

# Transcribe via HTTP API (OpenAI compatible)
curl -X POST http://localhost:28100/v1/audio/transcriptions \
  -F "file=@meeting.wav" \
  -F "model=whisper-large" \
  -F "language=en" \
  -F "response_format=verbose_json"

# Simple text response
curl -X POST http://localhost:28100/v1/audio/transcriptions \
  -F "file=@audio.wav" \
  -F "model=whisper-large" \
  -F "response_format=text"
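
In scripts it helps to check the HTTP status code before trusting the response body. The sketch below wraps the text-format request above; the function name transcribe_http and the SWAMA_URL variable are hypothetical, and it assumes that any 2xx status means success. The shape of Swama's error body is not documented here, so it is passed through to stderr unchanged.

```shell
# POST an audio file and print the transcript only when the server reports success.
# SWAMA_URL is a hypothetical override for the base URL used in the examples above.
transcribe_http() {
  file="$1"
  url="${SWAMA_URL:-http://localhost:28100}/v1/audio/transcriptions"
  out=$(mktemp)
  # -o saves the body to a temp file; -w prints only the numeric status code.
  code=$(curl -s -o "$out" -w '%{http_code}' -X POST "$url" \
    -F "file=@$file" \
    -F "model=whisper-large" \
    -F "response_format=text") || code="000"   # curl itself failed (e.g. connection refused)
  case "$code" in
    2??) cat "$out"; rc=0 ;;
    *)   echo "request failed with HTTP status $code" >&2
         cat "$out" >&2
         rc=1 ;;
  esac
  rm -f "$out"
  return "$rc"
}
```

For example, `transcribe_http meeting.wav > meeting.txt` writes the transcript only if the request succeeded, and returns a non-zero exit status otherwise.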

📦 Download

Download Swama v1.3.0

Available formats:

DMG installer - Easy drag-and-drop installation for macOS
ZIP archive - Direct application bundle

🔄 Upgrade Notes

If upgrading from a previous version: After installing the new version, open Swama from the menu bar and click "Install Command Line Tool…" to update the CLI tools
New audio capabilities: The transcribe command and audio API are immediately available after upgrade
Model storage: WhisperKit models are stored in ~/.swama/models/whisperkit to keep them organized separately from MLX models
API compatibility: Existing /v1/chat/completions and /v1/embeddings endpoints continue to work unchanged

🔧 Requirements

macOS 14.0+
Apple Silicon (M1/M2/M3/M4)
For audio transcription: Compatible audio formats (WAV recommended, other formats auto-converted)
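
The first two requirements can be checked from a terminal before installing. This is a rough sketch with a hypothetical helper name, meets_requirements; it relies on the standard macOS commands `sw_vers -productVersion` and `uname -m`, and assumes `uname -m` reports arm64 on Apple Silicon (a shell running under Rosetta reports x86_64 instead).

```shell
# Return success when the given macOS version is 14 or later and the CPU is arm64.
# Arguments: <macOS version, e.g. 14.5> <architecture, e.g. arm64>
meets_requirements() {
  version="$1"
  arch="$2"
  major="${version%%.*}"               # keep only the major version number
  [ "$arch" = "arm64" ] && [ "$major" -ge 14 ] 2>/dev/null
}

# Usage on a Mac:
#   meets_requirements "$(sw_vers -productVersion)" "$(uname -m)" && echo "OK"
```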

🎯 Key Benefits

Privacy-focused: All audio processing happens locally on your device
Cost-effective: No API calls to external services for transcription
High performance: Optimized for Apple Silicon with intelligent memory management
Developer-friendly: OpenAI-compatible API for easy integration into existing workflows

What's Changed

Support Audio by @sxy-trans-n in #31

Full Changelog: v1.2.0...v1.3.0

Trans-N-ai/swama v1.3.0 on GitHub
