v0.9
Changes since v0.6: 272 files changed, ~54k lines added.
New platforms
- Zoom SDK integration — native Zoom Meeting SDK bot with real-time transcription, speaker diarization, and C++ audio bridge (`services/vexa-bot/core/src/platforms/zoom/`)
- MS Teams full URL support — enterprise deep links, v2 fragment format (`#/meet/<id>`), ZoomGov URLs, Meet nickname URLs
Interactive bots
New bot capabilities beyond passive transcription:
- Speak — TTS playback into meetings via new `tts-service` (OpenAI-compatible `/v1/audio/speech`)
- Chat — read/send meeting chat messages
- Screen share — push content into meetings
- Virtual camera & avatar — configurable default avatar, post-admission camera re-enablement
- Microphone — programmatic mic control
New API endpoints: `/bots/{platform}/{id}/speak`, `/chat`, `/screen`, `/avatar`
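
As a rough illustration of how the interactive endpoints are addressed, the sketch below builds the URL for each action. It assumes that `/chat`, `/screen`, and `/avatar` share the `/bots/{platform}/{id}` prefix shown for `/speak`. The base URL, the `google_meet` platform label, and the meeting ID are hypothetical, and request bodies are left out because these notes do not specify them.

```python
from urllib.parse import quote

# Actions named in the release notes; request bodies are not specified there.
BOT_ACTIONS = ("speak", "chat", "screen", "avatar")


def bot_action_url(base_url: str, platform: str, meeting_id: str, action: str) -> str:
    """Build the URL for an interactive-bot action on a given meeting."""
    if action not in BOT_ACTIONS:
        raise ValueError(f"unknown bot action: {action!r}")
    # Percent-encode path segments so unusual IDs cannot break the path.
    return (
        f"{base_url.rstrip('/')}/bots/"
        f"{quote(platform, safe='')}/{quote(meeting_id, safe='')}/{action}"
    )


# Hypothetical base URL, platform label, and meeting ID, for illustration only.
print(bot_action_url("https://vexa.example.test", "google_meet", "abc-defg-hij", "speak"))
```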
Recording pipeline
- Full recording storage with MinIO (S3-compatible) and local filesystem backends
- Incremental upload during meeting via WhisperLive durable spool
- Inline range streaming for playback
- Post-meeting aggregate transcription from recordings
- API: `GET /recordings`, `GET /recordings/{id}`, `GET /recordings/{id}/media/{id}/download`, `DELETE`
- Browser audio flush/persist on leave for both Google Meet and Teams
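
Range streaming for playback typically relies on the standard HTTP `Range` header. The sketch below only constructs such a request and does not send it. It assumes the download endpoint honors byte ranges, which these notes imply but do not spell out. The host and both IDs are hypothetical.

```python
from typing import Optional
from urllib.request import Request


def range_request(url: str, start: int, end: Optional[int] = None) -> Request:
    """Build a GET request for bytes start..end (inclusive) of a resource."""
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    # An open-ended range ("bytes=1000-") asks for everything from start onward.
    spec = f"bytes={start}-" if end is None else f"bytes={start}-{end}"
    return Request(url, headers={"Range": spec}, method="GET")


# Hypothetical host and IDs; the path shape follows the download endpoint above.
req = range_request(
    "https://vexa.example.test/recordings/12/media/34/download", 0, 1_048_575
)
print(req.get_header("Range"))  # bytes=0-1048575
```

A server that supports ranges answers with `206 Partial Content`. One that does not may return the full body with `200`, so a client should check the status code before seeking.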
Deployment
- Vexa Lite — single-container all-in-one deployment with supervisord, process orchestrator, CPU transcription (`docker/lite/`)
- Process orchestrator — spawn bots as local Node.js processes for Lite mode (`ORCHESTRATOR=process`)
- Multi-arch Docker builds (amd64 + arm64)
Webhooks & hooks
- HMAC-SHA256 signed webhook delivery with exponential backoff retry
- Generic post-meeting hooks system (`POST_MEETING_HOOKS` env var) for billing/analytics integrations
- `transcribe_enabled` persisted in meeting data and included in webhook payload
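
On the receiving side, a signed webhook is usually checked by recomputing the HMAC over the raw request body. The sketch below shows that check along with a simple exponential backoff schedule. The hex encoding of the signature and the backoff parameters (base, factor, cap) are assumptions, as is how the signature reaches the receiver; none of these details are stated in these notes.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Return the hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check a received signature using a constant-time comparison."""
    expected = sign(secret, body).encode("utf-8")
    return hmac.compare_digest(expected, signature_hex.encode("utf-8"))


def backoff_delays(attempts: int, base: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0) -> list:
    """Delay in seconds before each retry: base * factor**i, capped at `cap`."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


# Hypothetical secret and payload, for illustration only.
secret = b"example-secret"
body = b'{"event": "meeting.completed", "transcribe_enabled": true}'
signature = sign(secret, body)

print(verify(secret, body, signature))         # True
print(verify(secret, body + b" ", signature))  # False
print(backoff_delays(5))                       # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Verification should run against the exact bytes received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the signature.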
Transcription improvements
- Standalone transcription service with faster-whisper, load balancing, quality testing framework (`services/transcription-service/`)
- WhisperLive: configurable settings via env vars (VAD thresholds, `SAME_OUTPUT_THRESHOLD`, concurrency limits)
- Remote transcriber mode for offloading to external GPU
- Bot optimization: disable incoming video tracks for transcription-only bots (~87% CPU savings)
MCP service
Expanded from ~200 to ~930 lines:
- Meeting URL parsing (Meet, Teams, Zoom)
- Recordings tools
- Bearer auth
- Transcript notes and prompts
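
To give a sense of what meeting URL parsing involves, here is a minimal sketch that maps a URL to a platform and meeting ID. It is not the MCP service's actual parser. The platform labels, the recognized hostnames, and the example URLs are assumptions, and only a few common link shapes are handled.

```python
import re
from typing import Optional, Tuple
from urllib.parse import urlparse


def _is_domain(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def parse_meeting_url(url: str) -> Optional[Tuple[str, str]]:
    """Return (platform, meeting_id), or None if the URL is not recognized."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    segments = [s for s in parts.path.split("/") if s]

    if host == "meet.google.com":
        # Covers both standard codes (abc-defg-hij) and nickname links.
        return ("google_meet", segments[0]) if segments else None

    if _is_domain(host, "zoom.us") or _is_domain(host, "zoomgov.com"):
        # Join links look like /j/<numeric id>, optionally with ?pwd=...
        if len(segments) >= 2 and segments[0] == "j" and segments[1].isdigit():
            return ("zoom", segments[1])
        return None

    if host in ("teams.microsoft.com", "teams.live.com"):
        # v2 deep links keep the ID in the fragment: #/meet/<id>
        match = re.match(r"^/meet/([^/?]+)", parts.fragment)
        if match:
            return ("teams", match.group(1))
        if len(segments) >= 2 and segments[0] == "meet":
            return ("teams", segments[1])
        return None

    return None


# Hypothetical example URLs.
print(parse_meeting_url("https://meet.google.com/abc-defg-hij"))
print(parse_meeting_url("https://us02web.zoom.us/j/1234567890?pwd=x"))
print(parse_meeting_url("https://teams.microsoft.com/v2/#/meet/9876543210?p=abc"))
```

Matching on the exact domain or a dot-prefixed subdomain, rather than a bare suffix, keeps look-alike hosts such as `evilzoom.us` from being accepted.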
API gateway
New proxied endpoints since v0.6: `/speak`, `/chat`, `/screen`, `/avatar`, `/recordings/*`, public transcript share links
Documentation
Full rewrite to Mintlify — API reference (bots, meetings, transcripts, recordings, settings), platform guides (Meet, Teams, Zoom), deployment guides (compose, Lite, Helm), webhooks, interactive bots, voice agent, recording storage, security, troubleshooting