github Trans-N-ai/swama v1.3.0

Swama v1.3.0 Release Notes

🆕 What's New

OpenAI-Compatible Audio API

  • /v1/audio/transcriptions endpoint - Full OpenAI API compatibility for seamless integration
  • Multipart form data support - Proper file upload handling with audio format validation
  • Multiple response formats - json, text, and verbose_json, selected with the response_format field
  • Robust error handling - Comprehensive error responses with proper HTTP status codes
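
A minimal Python sketch of how a client might handle the three response formats. It assumes the JSON bodies follow the OpenAI transcription shape (a top-level "text" field, plus a "segments" list with "start", "end", and "text" for verbose_json); these notes do not confirm Swama's exact fields, and the helper names are hypothetical.

```python
import json


def extract_transcript(body: str, response_format: str) -> str:
    """Return the transcript text from a raw response body."""
    if response_format == "text":
        # Plain-text responses carry the transcript directly.
        return body.strip()
    if response_format in ("json", "verbose_json"):
        # Assumption: OpenAI-style payload with a top-level "text" field.
        return json.loads(body)["text"].strip()
    raise ValueError(f"unsupported response_format: {response_format!r}")


def format_segments(body: str) -> list[str]:
    """Render verbose_json segments as '[start - end] text' lines."""
    payload = json.loads(body)
    lines = []
    # Assumption: each segment has "start" and "end" in seconds and a "text" field.
    for seg in payload.get("segments", []):
        lines.append(
            f"[{seg['start']:.2f}s - {seg['end']:.2f}s] {seg['text'].strip()}"
        )
    return lines
```

If the server returns different field names, only the two dictionary lookups need to change.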

Enhanced CLI with Audio Commands

  • New transcribe command - Comprehensive audio transcription with customizable options
  • Enhanced pull command - Unified downloading for both MLX and WhisperKit models
  • Rich CLI options - Support for model selection, language, temperature, prompts, and output formats
  • Intelligent model validation - Automatic detection and validation of WhisperKit models

🚀 Usage

Audio Transcription

# Basic transcription
swama transcribe audio.wav

# Specify model and language
swama transcribe audio.wav -m whisper-base -l en

# Get detailed output with timestamps
swama transcribe audio.wav --verbose

# JSON output for programmatic use
swama transcribe audio.wav -f json

# Fine-tune with temperature and prompt
swama transcribe audio.wav -t 0.2 -p "Technical discussion about AI"
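
For scripted use, the same commands can be driven from Python. The sketch below only uses the flags shown above (-m, -l, -f, -t, -p). It assumes that swama is on PATH and that -f json prints a JSON document to stdout, which these notes do not spell out; the function names are hypothetical.

```python
import json
import subprocess


def build_transcribe_command(audio_path, model=None, language=None,
                             output_format=None, temperature=None, prompt=None):
    """Build the argument list for `swama transcribe` from optional settings."""
    cmd = ["swama", "transcribe", audio_path]
    if model is not None:
        cmd += ["-m", model]
    if language is not None:
        cmd += ["-l", language]
    if output_format is not None:
        cmd += ["-f", output_format]
    if temperature is not None:
        cmd += ["-t", str(temperature)]
    if prompt is not None:
        cmd += ["-p", prompt]
    return cmd


def transcribe(audio_path, model=None, language=None):
    """Run the CLI with JSON output and return the parsed result.

    Assumption: the command writes the JSON document to stdout.
    """
    cmd = build_transcribe_command(
        audio_path, model=model, language=language, output_format="json"
    )
    # check=True raises CalledProcessError on a non-zero exit status.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.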

WhisperKit Model Management

# Download WhisperKit models
swama pull whisper-tiny
swama pull whisper-base
swama pull whisper-small
swama pull whisper-large

# List all available models (includes WhisperKit)
swama list

Audio API Integration

# Transcribe via HTTP API (OpenAI compatible)
curl -X POST http://localhost:28100/v1/audio/transcriptions \
  -F "file=@meeting.wav" \
  -F "model=whisper-large" \
  -F "language=en" \
  -F "response_format=verbose_json"

# Simple text response
curl -X POST http://localhost:28100/v1/audio/transcriptions \
  -F "file=@audio.wav" \
  -F "model=whisper-large" \
  -F "response_format=text"
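
The curl calls above can be reproduced from Python without third-party packages. This is a sketch under stated assumptions: the server listens on http://localhost:28100 as in the examples, the form fields are the ones shown there (file, model, language, response_format), and the helper names are hypothetical.

```python
import mimetypes
import os
import urllib.request
import uuid


def encode_multipart(fields, filename, file_bytes, boundary=None):
    """Encode text fields plus one "file" part as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: {file_type}\r\n\r\n'.encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe_via_api(path, model="whisper-large", response_format="text",
                       language=None, base_url="http://localhost:28100"):
    """POST an audio file to /v1/audio/transcriptions and return the raw body."""
    fields = {"model": model, "response_format": response_format}
    if language is not None:
        fields["language"] = language
    with open(path, "rb") as fh:
        audio = fh.read()
    body, content_type = encode_multipart(fields, os.path.basename(path), audio)
    request = urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

For example, transcribe_via_api("meeting.wav", response_format="verbose_json", language="en") mirrors the first curl call; the returned string still needs to be parsed according to the chosen format.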

📦 Download

Download Swama v1.3.0

Available formats:

  • DMG installer - Easy drag-and-drop installation for macOS
  • ZIP archive - Direct application bundle

🔄 Upgrade Notes

  • Upgrading from a previous version: after installing the new version, open Swama from the menu bar and click "Install Command Line Tool…" to update the CLI tools
  • New audio capabilities: The transcribe command and audio API are immediately available after upgrade
  • Model storage: WhisperKit models are stored in ~/.swama/models/whisperkit to keep them organized separately from MLX models
  • API compatibility: Existing /v1/chat/completions and /v1/embeddings endpoints continue to work unchanged
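
To check which WhisperKit models are already on disk, a script can look at the storage path named in the model storage note. The sketch below assumes each downloaded model appears as a subdirectory directly under ~/.swama/models/whisperkit; the layout inside that folder is not described in these notes, and the function name is hypothetical.

```python
from pathlib import Path

# Storage location taken from the upgrade notes.
WHISPERKIT_DIR = Path.home() / ".swama" / "models" / "whisperkit"


def list_local_whisperkit_models(root: Path = WHISPERKIT_DIR) -> list[str]:
    """Return the sorted names of subdirectories under the WhisperKit model folder."""
    if not root.is_dir():
        # Nothing downloaded yet, or the folder lives elsewhere.
        return []
    # Assumption: one subdirectory per downloaded model.
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())
```

swama list remains the supported way to see available models; this only inspects the local folder.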

🔧 Requirements

  • macOS 14.0+
  • Apple Silicon (M1/M2/M3/M4)
  • For audio transcription: an audio file in a supported format (WAV recommended; other formats are converted automatically)

🎯 Key Benefits

  • Privacy-focused: All audio processing happens locally on your device
  • Cost-effective: No API calls to external services for transcription
  • High performance: Optimized for Apple Silicon with intelligent memory management
  • Developer-friendly: OpenAI-compatible API for easy integration into existing workflows

What's Changed

Full Changelog: v1.2.0...v1.3.0
