v0.10.3 — Post-incident stabilization
Release tag: 0.10.0-260421-2337 · Compose: docker pull vexaai/meeting-api:0.10.0-260421-2337 · Helm: chart vexa-0.1.0 values global.imageTag=latest
Seven issue packs addressing GitHub reports + the 2026-04-20 incident-doc findings. Full release protocol run (groom → plan → develop → deploy → validate → triage → human → ship) with registry-gated regression guards.
Highlights
| Area | Change |
|---|---|
| 🛡️ Chart secrets (#221) | Every prod env (`DB_PASSWORD`, `TRANSCRIPTION_SERVICE_TOKEN`, etc.) now renders via `secretKeyRef`; `required` fail-loud at helm-template time when postgres is external and the credentials secret is missing. |
| 💾 Bot recording durability (#218) | MediaRecorder switched to incremental 30-second chunk uploads (`/internal/recordings/upload` now takes `chunk_seq`). Each chunk lands in MinIO immediately at `recordings/<user>/<id>/<session>/<audio\|video>/NNNNNN.webm`. Mid-meeting SIGKILL no longer loses the whole recording: pre-crash chunks survive. `Recording.status` stays `IN_PROGRESS` until explicit `is_final=true`. |
| 🧩 Transcript dedup (#220) | `packages/transcript-rendering`: containment branch now prefers the completed segment in both directions, eliminating stale italic drafts after confirmation. Package bumped 0.4.0 → 0.4.1; first CI matrix workflow (`.github/workflows/test-packages.yml`) added for every package push. |
| 🏊 DB pool reset on return (#208) | `pool_reset_on_return="rollback"` enforced on engine; new `DB_POOL_NO_EXHAUSTION` check + structural grep guard. Under stress, connections return to the pool cleanly (no `idle in transaction` growth). |
| 🔄 Rolling update zero-overlap | New `vexa.deploymentStrategy` helper → `maxSurge: 0`, `maxUnavailable: 1` across subchart Deployments. `apiGateway.replicaCount` default 1→2 so api-gateway rolls one at a time with zero downtime. |
| 🐷 PgBouncer as optional OSS subchart | `pgbouncer.enabled: false` by default. Flip to `true` and every service's `DB_HOST` rewires to the pgbouncer Service via `vexa.dbHostEffective`. Monolithic template pattern matching postgres/redis/minio. |
| 🪦 Durable bot-exit callbacks | runtime-api `idle_loop` now sweeps `state.list_pending_callbacks()` and re-invokes delivery. Orphan-active meetings become impossible by construction; meeting-api downtime no longer strands bot exits in limbo. |
| 🔒 Security: internal ports loopback (P0) | lite + compose bind postgres, redis, admin-api, runtime-api, mcp, minio to `127.0.0.1` instead of `0.0.0.0`. Follow-up to the 2026-04 lite ransomware attack; only user-facing surfaces (gateway + dashboard) remain public. |
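
The durable bot-exit callback behaviour in the table above can be pictured with a minimal sketch. Only `state.list_pending_callbacks()` is named in these notes; `PendingCallback`, `CallbackState`, `mark_delivered`, and `sweep_pending_callbacks` are hypothetical names chosen for illustration, and the real runtime-api code may be structured quite differently.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PendingCallback:
    """One undelivered bot-exit notification (hypothetical shape)."""
    meeting_id: str
    payload: dict
    attempts: int = 0


@dataclass
class CallbackState:
    """In-memory stand-in for the durable store; an assumption, not the real one."""
    pending: dict = field(default_factory=dict)

    def list_pending_callbacks(self) -> list:
        # Return a copy so the sweep can remove entries while iterating.
        return list(self.pending.values())

    def mark_delivered(self, meeting_id: str) -> None:
        self.pending.pop(meeting_id, None)


def sweep_pending_callbacks(
    state: CallbackState,
    deliver: Callable[[PendingCallback], bool],
) -> int:
    """Retry every pending callback once; return how many were delivered."""
    delivered = 0
    for cb in state.list_pending_callbacks():
        cb.attempts += 1
        try:
            ok = deliver(cb)
        except Exception:
            # Receiver is down or errored: keep the callback for the next sweep.
            ok = False
        if ok:
            state.mark_delivered(cb.meeting_id)
            delivered += 1
    return delivered
```

Because a callback is removed only after a successful delivery, an outage on the receiving side delays the exit notification instead of dropping it; the next idle-loop pass picks it up again.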
Round-1 + Round-2 human-eyeroll fixes
- Bug B (Teams bot exit 137, OOM): `runtime-api` meeting profile memory limit 1536Mi → 2560Mi (both `profiles.yaml` and chart `values.yaml`); pod `terminationGracePeriodSeconds=60` matching `stop()` grace. Registry hardened with `RUNTIME_API_MEETING_PROFILE_MEMORY_2560MI` + `HELM_CHART_RUNTIME_API_MEMORY_2560MI`.
- Bugs C+D (media_file storage-path collision): `recordings.py` now uses `{session}/{type}/{chunk_seq:06d}.{format}` so audio and video never overwrite each other; `media_files` materialized only on `is_final=true`.
- Bug E (video default was `True`): `POST /bots` `video` field flipped to `False`. Audio-only is now the default for transcription-focused deployments; video opt-in via explicit `video=true`. New regression check `BOT_VIDEO_DEFAULT_OFF`.
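
To make the Bugs C+D fix concrete, here is a minimal sketch of how a chunk's object key could be derived from the `{session}/{type}/{chunk_seq:06d}.{format}` pattern, prefixed with the `recordings/<user>/<id>/` layout mentioned in the highlights. The function name `chunk_key` and its parameter names are assumptions for illustration, not the actual code in `recordings.py`.

```python
def chunk_key(
    user_id: int,
    recording_id: int,
    session: str,
    media_type: str,
    chunk_seq: int,
    fmt: str = "webm",
) -> str:
    """Build the object-store key for one recording chunk (illustrative only)."""
    if media_type not in ("audio", "video"):
        raise ValueError(f"unknown media type: {media_type!r}")
    if chunk_seq < 0:
        raise ValueError("chunk_seq must be non-negative")
    # The media type is its own path segment, so audio and video chunks with
    # the same sequence number can never share a key.
    return (
        f"recordings/{user_id}/{recording_id}/"
        f"{session}/{media_type}/{chunk_seq:06d}.{fmt}"
    )
```

The zero-padded sequence number also keeps keys in upload order under a plain lexicographic listing, which is convenient when reassembling chunks after a crash.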
Known deferred
- #171 Teams `teams.live.com/meet/<numeric>` admission false-positive: pre-existing.
- #226 Teams `light-meetings/launch` confirmation modal blocks Join: filed during this cycle; `msteams/join.ts` needs to dismiss the new Teams "Continue without audio or video" modal before clicking Join.
- Harness "running-pod matches source" META check: new test class for the next groom (caught by hand this cycle when helm ran a stale image past the gate).
- `npm publish @vexaai/transcript-rendering@0.4.1`: pipeline is wired in `make release-publish-packages`, pending first-time npm auth on the release machine.
Regression registry additions
BOT_VIDEO_DEFAULT_OFF, RUNTIME_API_MEETING_PROFILE_MEMORY_2560MI, HELM_CHART_RUNTIME_API_MEMORY_2560MI, HELM_PROD_SECRETS_SECRETREF_ONLY, HELM_PROD_SECRETS_REQUIRED_AT_RENDER, RECORDING_UPLOAD_SUPPORTS_CHUNK_SEQ, BOT_RECORDS_INCREMENTALLY, RECORDING_SURVIVES_MID_MEETING_KILL, RUNTIME_API_STOP_GRACE_MATCHES_POD_SPEC, TRANSCRIPT_RENDERING_DEDUP_TESTS_PASS, PACKAGES_CI_WORKFLOW_EXISTS, ENGINE_POOL_RESET_ON_RETURN_ROLLBACK, HELM_ALL_SERVICES_DB_POOL_TUNED, DB_POOL_NO_EXHAUSTION, HELM_DEPLOYMENT_STRATEGY_HELPER_DEFINED, HELM_ROLLING_UPDATE_ZERO_SURGE, HELM_API_GATEWAY_REPLICA_COUNT_HA, HELM_PGBOUNCER_OPTIONAL_AND_WIRED, RUNTIME_API_IDLE_LOOP_SWEEPS_PENDING_CALLBACKS, RUNTIME_API_EXIT_CALLBACK_DURABLE, LITE_POSTGRES_NOT_PUBLIC, LITE_INTERNAL_SERVICES_LOOPBACK_ONLY, LITE_REDIS_NOT_PUBLIC, COMPOSE_PORTS_LOOPBACK_ONLY.
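
As a rough idea of what a guard such as `COMPOSE_PORTS_LOOPBACK_ONLY` might verify, the sketch below checks Compose short-syntax port mappings (`[HOST_IP:]HOST_PORT:CONTAINER_PORT[/PROTOCOL]`) for a loopback host IP. The helper names `is_loopback_only` and `non_loopback_ports` are hypothetical; the actual registry check is not shown in these notes and may work differently (for example, by grepping the compose file).

```python
import ipaddress


def is_loopback_only(port_mapping: str) -> bool:
    """True when a short-syntax port mapping publishes on a loopback IP only."""
    spec = port_mapping.split("/", 1)[0]  # drop an optional "/tcp" or "/udp"
    parts = spec.rsplit(":", 2)
    if len(parts) != 3:
        # No host IP given: the port is published on all interfaces.
        return False
    host_ip = parts[0].strip("[]")  # allow bracketed IPv6 such as [::1]
    try:
        return ipaddress.ip_address(host_ip).is_loopback
    except ValueError:
        return False


def non_loopback_ports(services: dict) -> list:
    """Return (service, mapping) pairs that are reachable beyond loopback."""
    return [
        (name, mapping)
        for name, mappings in services.items()
        for mapping in mappings
        if not is_loopback_only(mapping)
    ]
```

A check along these lines would pass for the internal services listed above only when `non_loopback_ports` returns an empty list for them, with the gateway and dashboard excluded as intentionally public.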
Full changelog: #227 · Validation report: tests3/reports/release-0.10.0-260421-2337.md