github Vexa-ai/vexa v0.10.3
v0.10.3 — post-incident stabilization

Release tag: 0.10.0-260421-2337 · Compose: docker pull vexaai/meeting-api:0.10.0-260421-2337 · Helm: chart vexa-0.1.0 values global.imageTag=latest

Seven issue packs addressing GitHub reports + the 2026-04-20 incident-doc findings. Full release protocol run (groom → plan → develop → deploy → validate → triage → human → ship) with registry-gated regression guards.

Highlights

| Area | Change |
| --- | --- |
| 🛡️ Chart secrets (#221) | Every prod env (DB_PASSWORD, TRANSCRIPTION_SERVICE_TOKEN, etc.) now renders via secretKeyRef; a required check fails loudly at helm-template time when postgres is external and the credentials secret is missing. |
| 💾 Bot recording durability (#218) | MediaRecorder switched to incremental 30-second chunk uploads (/internal/recordings/upload now takes chunk_seq). Each chunk lands in MinIO immediately at recordings/<user>/<id>/<session>/<audio\|video>/NNNNNN.webm. A mid-meeting SIGKILL no longer loses the whole recording: pre-crash chunks survive. Recording.status stays IN_PROGRESS until an explicit is_final=true. |
| 🧩 Transcript dedup (#220) | packages/transcript-rendering: the containment branch now prefers the completed segment in both directions, eliminating stale italic drafts after confirmation. Package bumped 0.4.0 → 0.4.1; first CI matrix workflow (.github/workflows/test-packages.yml) added for every package push. |
| 🏊 DB pool reset on return (#208) | pool_reset_on_return="rollback" enforced on the engine; new DB_POOL_NO_EXHAUSTION check plus a structural grep guard. Under stress, connections return to the pool cleanly (no growth in idle-in-transaction connections). |
| 🔄 Rolling update zero-overlap | New vexa.deploymentStrategy helper → maxSurge: 0, maxUnavailable: 1 across subchart Deployments. apiGateway.replicaCount default 1 → 2 so api-gateway rolls one pod at a time with zero downtime. |
| 🐷 PgBouncer as optional OSS subchart | pgbouncer.enabled: false by default. Flip it to true and every service's DB_HOST rewires to the pgbouncer Service via vexa.dbHostEffective. Uses the same monolithic template pattern as postgres/redis/minio. |
| 🪦 Durable bot-exit callbacks | The runtime-api idle_loop now sweeps state.list_pending_callbacks() and re-invokes delivery. Orphan-active meetings become impossible by construction: meeting-api downtime no longer strands bot exits in limbo. |
| 🔒 Security: internal ports on loopback (P0) | lite and compose now bind postgres, redis, admin-api, runtime-api, mcp, and minio to 127.0.0.1 instead of 0.0.0.0. Follow-up to the 2026-04 lite ransomware attack; only user-facing surfaces (gateway + dashboard) remain public. |
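
To see why the rolling-update change pairs maxSurge: 0 / maxUnavailable: 1 with a replica count of 2, the pod-count bounds during a rollout can be worked out directly. The sketch below assumes standard Kubernetes RollingUpdate semantics with integer (not percentage) values; the function name rollout_bounds is hypothetical and is not part of the chart.

```python
from __future__ import annotations


def rollout_bounds(replicas: int, max_surge: int, max_unavailable: int) -> tuple[int, int]:
    """Return (max_total_pods, min_available_pods) during a RollingUpdate.

    Kubernetes allows at most `replicas + max_surge` pods to exist at once and
    at most `max_unavailable` of the desired replicas to be unavailable.
    """
    max_total = replicas + max_surge
    min_available = max(replicas - max_unavailable, 0)
    return max_total, min_available


if __name__ == "__main__":
    for replicas in (1, 2):
        total, available = rollout_bounds(replicas, max_surge=0, max_unavailable=1)
        print(f"replicas={replicas}: at most {total} pods, at least {available} available")
```

With a single replica the lower bound is zero available pods, so a zero-surge rollout would briefly take the service down; with two replicas one pod keeps serving while the other is replaced, and the pod count never exceeds the desired count.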
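
The durable bot-exit callback behaviour can be pictured as a retry sweep over a pending set. The following is a minimal sketch, not the runtime-api implementation: only list_pending_callbacks() is named in these notes, while PendingCallback, InMemoryState, add_pending_callback, mark_delivered, and sweep_pending_callbacks are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PendingCallback:
    bot_id: str
    payload: dict = field(default_factory=dict)
    attempts: int = 0


class InMemoryState:
    """Stand-in for the real state store (hypothetical)."""

    def __init__(self) -> None:
        self._pending: Dict[str, PendingCallback] = {}

    def add_pending_callback(self, cb: PendingCallback) -> None:
        self._pending[cb.bot_id] = cb

    def list_pending_callbacks(self) -> List[PendingCallback]:
        return list(self._pending.values())

    def mark_delivered(self, bot_id: str) -> None:
        self._pending.pop(bot_id, None)


def sweep_pending_callbacks(state: InMemoryState,
                            deliver: Callable[[PendingCallback], None]) -> int:
    """Retry every pending callback once; return how many were delivered."""
    delivered = 0
    for cb in state.list_pending_callbacks():
        cb.attempts += 1
        try:
            deliver(cb)
        except Exception:
            # Receiver is unavailable: keep the callback pending for the next sweep.
            continue
        state.mark_delivered(cb.bot_id)
        delivered += 1
    return delivered
```

Because a callback is removed only after delivery succeeds, an outage on the receiving side delays the exit notification rather than dropping it; an idle loop would call the sweep on each tick.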

Round-1 + Round-2 human-eyeroll fixes

  • Bug B — Teams bot exit 137 (OOM): runtime-api meeting profile memory limit 1536Mi → 2560Mi (both profiles.yaml and chart values.yaml); pod terminationGracePeriodSeconds=60 matching stop() grace. Registry hardened with RUNTIME_API_MEETING_PROFILE_MEMORY_2560MI + HELM_CHART_RUNTIME_API_MEMORY_2560MI.
  • Bugs C+D — media_file storage-path collision: recordings.py now uses {session}/{type}/{chunk_seq:06d}.{format} so audio and video never overwrite each other; media_files materialized only on is_final=true.
  • Bug E — video default was True: POST /bots video field flipped to False. Audio-only is now the default for transcription-focused deployments; video opt-in via explicit video=true. New regression check BOT_VIDEO_DEFAULT_OFF.
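
The storage-path layout from the Bugs C+D fix can be sketched as a small key builder. This is an illustration based only on the path pattern quoted in these notes (recordings/<user>/<id>/<session>/<audio|video>/NNNNNN.webm); the function name chunk_object_key and its parameter names are assumptions, not the code in recordings.py.

```python
def chunk_object_key(user_id: str, recording_id: str, session: str,
                     media_type: str, chunk_seq: int, fmt: str = "webm") -> str:
    """Build the object key for one uploaded chunk (hypothetical helper)."""
    if media_type not in ("audio", "video"):
        raise ValueError(f"unexpected media type: {media_type!r}")
    if chunk_seq < 0:
        raise ValueError("chunk_seq must be non-negative")
    # The media type is its own path segment, so audio and video chunks with the
    # same sequence number can never collide. Zero-padding to six digits makes a
    # lexicographic object listing match upload order.
    return f"recordings/{user_id}/{recording_id}/{session}/{media_type}/{chunk_seq:06d}.{fmt}"


if __name__ == "__main__":
    for seq in range(3):
        print(chunk_object_key("42", "7", "sess-1", "audio", seq))
```

Since every chunk has its own key, whatever was uploaded before a crash remains listable under the session prefix.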

Known deferred

  • #171 Teams teams.live.com/meet/<numeric> admission false-positive — pre-existing.
  • #226 Teams light-meetings/launch confirmation modal blocks Join (filed during this cycle): msteams/join.ts needs to dismiss the new Teams "Continue without audio or video" modal before clicking Join.
  • Harness "running-pod matches source" META check — new test class for the next groom (caught by hand this cycle when helm ran a stale image past the gate).
  • npm publish @vexaai/transcript-rendering@0.4.1 — pipeline is wired in make release-publish-packages, pending first-time npm auth on the release machine.

Regression registry additions

BOT_VIDEO_DEFAULT_OFF, RUNTIME_API_MEETING_PROFILE_MEMORY_2560MI, HELM_CHART_RUNTIME_API_MEMORY_2560MI, HELM_PROD_SECRETS_SECRETREF_ONLY, HELM_PROD_SECRETS_REQUIRED_AT_RENDER, RECORDING_UPLOAD_SUPPORTS_CHUNK_SEQ, BOT_RECORDS_INCREMENTALLY, RECORDING_SURVIVES_MID_MEETING_KILL, RUNTIME_API_STOP_GRACE_MATCHES_POD_SPEC, TRANSCRIPT_RENDERING_DEDUP_TESTS_PASS, PACKAGES_CI_WORKFLOW_EXISTS, ENGINE_POOL_RESET_ON_RETURN_ROLLBACK, HELM_ALL_SERVICES_DB_POOL_TUNED, DB_POOL_NO_EXHAUSTION, HELM_DEPLOYMENT_STRATEGY_HELPER_DEFINED, HELM_ROLLING_UPDATE_ZERO_SURGE, HELM_API_GATEWAY_REPLICA_COUNT_HA, HELM_PGBOUNCER_OPTIONAL_AND_WIRED, RUNTIME_API_IDLE_LOOP_SWEEPS_PENDING_CALLBACKS, RUNTIME_API_EXIT_CALLBACK_DURABLE, LITE_POSTGRES_NOT_PUBLIC, LITE_INTERNAL_SERVICES_LOOPBACK_ONLY, LITE_REDIS_NOT_PUBLIC, COMPOSE_PORTS_LOOPBACK_ONLY.
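
As an illustration of what a guard such as COMPOSE_PORTS_LOOPBACK_ONLY might verify, here is a minimal sketch. It is not the project's actual check: the function names are hypothetical, and it only understands the common Compose short-syntax forms (HOST_PORT:CONTAINER_PORT, IP:HOST_PORT:CONTAINER_PORT, and a bracketed IPv6 host), treating anything else as not loopback-bound.

```python
from __future__ import annotations

import ipaddress


def host_ip(mapping: str) -> str | None:
    """Return the host IP of a Compose short-syntax port mapping, or None if absent."""
    spec = mapping.split("/", 1)[0]  # drop an optional "/tcp" or "/udp" suffix
    if spec.startswith("["):  # bracketed IPv6 host, e.g. "[::1]:9000:9000"
        return spec[1:spec.index("]")]
    parts = spec.split(":")
    # Only the three-part form "IP:HOST_PORT:CONTAINER_PORT" carries a host IP.
    return parts[0] if len(parts) == 3 else None


def is_loopback_only(mapping: str) -> bool:
    ip = host_ip(mapping)
    # Without an explicit host IP, Docker publishes on all interfaces by default.
    return ip is not None and ipaddress.ip_address(ip).is_loopback


def publicly_bound(services: dict[str, list[str]],
                   allowed_public: set[str]) -> dict[str, list[str]]:
    """Map each non-allowed service to its port mappings that are not loopback-only."""
    offenders: dict[str, list[str]] = {}
    for name, ports in services.items():
        if name in allowed_public:
            continue
        bad = [p for p in ports if not is_loopback_only(p)]
        if bad:
            offenders[name] = bad
    return offenders
```

A check built this way would pass only when publicly_bound returns an empty mapping for every service outside the intended public set (gateway and dashboard, per the highlights).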


Full changelog: #227 · Validation report: tests3/reports/release-0.10.0-260421-2337.md
