GitHub: Vexa-ai/vexa v0.9

Changes since v0.6: 272 files changed, ~54k lines added.

New platforms

  • Zoom SDK integration — native Zoom Meeting SDK bot with real-time transcription, speaker diarization, and C++ audio bridge (services/vexa-bot/core/src/platforms/zoom/)
  • Full meeting URL support: MS Teams enterprise deep links and v2 fragment format (#/meet/<id>), plus ZoomGov URLs and Meet nickname URLs
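
To illustrate the URL forms listed above, here is a minimal sketch of host-based platform detection and of reading the meeting ID from a v2 fragment (`#/meet/<id>`). This is not the project's parser: the function names, the returned platform labels, and the sample query parameter are hypothetical, and only the public hostnames and the fragment shape are taken as given.

```python
from typing import Optional
from urllib.parse import urlparse


def _is_domain(host: str, domain: str) -> bool:
    """True if host equals domain or is a subdomain of it."""
    return host == domain or host.endswith("." + domain)


def detect_platform(url: str) -> Optional[str]:
    """Guess the meeting platform from the URL host (labels are hypothetical)."""
    host = (urlparse(url).hostname or "").lower()
    if host == "meet.google.com":
        return "google_meet"
    if _is_domain(host, "zoom.us") or _is_domain(host, "zoomgov.com"):
        return "zoom"
    if host in ("teams.microsoft.com", "teams.live.com"):
        return "teams"
    return None


def teams_v2_meeting_id(url: str) -> Optional[str]:
    """Extract <id> from a fragment shaped like '#/meet/<id>'."""
    fragment = urlparse(url).fragment          # text after '#'
    path = fragment.split("?", 1)[0]           # drop any trailing parameters
    parts = path.strip("/").split("/")
    if len(parts) >= 2 and parts[0] == "meet" and parts[1]:
        return parts[1]
    return None
```

For example, `detect_platform("https://example.zoomgov.com/j/123")` yields the hypothetical label `"zoom"`, and a URL without a `#/meet/...` fragment makes `teams_v2_meeting_id` return `None`.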

Interactive bots

New bot capabilities beyond passive transcription:

  • Speak — TTS playback into meetings via new tts-service (OpenAI-compatible /v1/audio/speech)
  • Chat — read/send meeting chat messages
  • Screen share — push content into meetings
  • Virtual camera & avatar — configurable default avatar, post-admission camera re-enablement
  • Microphone — programmatic mic control

New API endpoints: `/bots/{platform}/{id}/speak`, `/chat`, `/screen`, `/avatar`
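
As a rough sketch of how a client might address these endpoints, the helper below builds the URL for each action. It assumes that `/chat`, `/screen`, and `/avatar` share the `/bots/{platform}/{id}` prefix shown for `/speak`; the base URL, the helper name, and the sample platform and meeting ID are placeholders, and request bodies are not covered because these notes do not specify them.

```python
from urllib.parse import quote

# Actions named in the release notes; assumed to share the /bots/{platform}/{id} prefix.
BOT_ACTIONS = ("speak", "chat", "screen", "avatar")


def bot_action_url(base_url: str, platform: str, meeting_id: str, action: str) -> str:
    """Build <base_url>/bots/<platform>/<meeting_id>/<action> (hypothetical helper)."""
    if action not in BOT_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return "/".join([
        base_url.rstrip("/"),
        "bots",
        quote(platform, safe=""),     # percent-encode so IDs cannot inject path segments
        quote(meeting_id, safe=""),
        action,
    ])
```

Calling `bot_action_url("http://localhost:8000", "zoom", "1234567890", "speak")` returns `http://localhost:8000/bots/zoom/1234567890/speak`.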

Recording pipeline

  • Full recording storage with MinIO (S3-compatible) and local filesystem backends
  • Incremental upload during meeting via WhisperLive durable spool
  • Inline range streaming for playback
  • Post-meeting aggregate transcription from recordings
  • API: `GET /recordings`, `GET /recordings/{id}`, `GET /recordings/{id}/media/{id}/download`, `DELETE`
  • Browser audio flush/persist on leave for both Google Meet and Teams
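
Range streaming for playback normally relies on standard HTTP `Range` requests and `Content-Range` responses. The sketch below shows both sides of that exchange under the assumption that the media endpoint honors standard byte ranges; the helper names are hypothetical and nothing here is taken from the Vexa codebase.

```python
import re
from typing import Dict, Optional, Tuple


def range_header(start: int, end: Optional[int] = None) -> Dict[str, str]:
    """Build a Range header for bytes start..end (inclusive); open-ended if end is None."""
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    last = "" if end is None else str(end)
    return {"Range": f"bytes={start}-{last}"}


_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def parse_content_range(value: str) -> Tuple[int, int, Optional[int]]:
    """Parse 'bytes <start>-<end>/<total>'; total is None when the server sends '*'."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    start, end, total = match.groups()
    return int(start), int(end), None if total == "*" else int(total)
```

A player would send `range_header(offset)` when seeking and use the parsed total to size its progress bar.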

Deployment

  • Vexa Lite — single-container all-in-one deployment with supervisord, process orchestrator, CPU transcription (`docker/lite/`)
  • Process orchestrator — spawn bots as local Node.js processes for Lite mode (`ORCHESTRATOR=process`)
  • Multi-arch Docker builds (amd64 + arm64)

Webhooks & hooks

  • HMAC-SHA256 signed webhook delivery with exponential backoff retry
  • Generic post-meeting hooks system (`POST_MEETING_HOOKS` env var) for billing/analytics integrations
  • `transcribe_enabled` persisted in meeting data and included in webhook payload
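
For receivers of these webhooks, a minimal sketch of HMAC-SHA256 verification and of an exponential backoff schedule follows. The details are assumptions, since these notes do not specify them: the signature is taken to be a hex digest over the raw request body, and the base delay, factor, and cap are illustrative values only.

```python
import hashlib
import hmac
from typing import List


def sign(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body (hex encoding is an assumption)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = sign(secret, body).encode("ascii")
    received = received_signature.strip().lower().encode("utf-8")
    return hmac.compare_digest(expected, received)


def backoff_delays(attempts: int, base: float = 1.0,
                   factor: float = 2.0, cap: float = 60.0) -> List[float]:
    """Delay before each retry: base * factor**i, capped (values are illustrative)."""
    return [min(cap, base * factor ** i) for i in range(attempts)]
```

Verification should run against the bytes exactly as received, before any JSON parsing, because re-serializing the payload can change the digest.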

Transcription improvements

  • Standalone transcription service with faster-whisper, load balancing, quality testing framework (`services/transcription-service/`)
  • WhisperLive: configurable settings via env vars (VAD thresholds, `SAME_OUTPUT_THRESHOLD`, concurrency limits)
  • Remote transcriber mode for offloading to external GPU
  • Bot optimization: disable incoming video tracks for transcription-only bots (~87% CPU savings)

MCP service

Expanded from ~200 to ~930 lines:

  • Meeting URL parsing (Meet, Teams, Zoom)
  • Recordings tools
  • Bearer auth
  • Transcript notes and prompts

API gateway

New proxied endpoints since v0.6: `/speak`, `/chat`, `/screen`, `/avatar`, `/recordings/*`, public transcript share links

Documentation

Full rewrite to Mintlify — API reference (bots, meetings, transcripts, recordings, settings), platform guides (Meet, Teams, Zoom), deployment guides (compose, Lite, Helm), webhooks, interactive bots, voice agent, recording storage, security, troubleshooting
