Scheduled triggers (crons) are now resilient: no single trigger can hold up the rest.
- A slow or stuck trigger fire (e.g. resuming a sandbox) no longer blocks the scheduler: each fire is time-bounded and isolated, so one trigger can fail and retry while every other trigger keeps firing on schedule.
- The scheduler self-heals — a stalled sweep is automatically reclaimed, so automations can't silently stop.
- Hardened background-worker leadership so an API-only node can never hold the scheduler lease without actually running the scheduler.
- Added a stall watchdog and a health signal so a frozen scheduler is surfaced immediately instead of going unnoticed.
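
To illustrate what "time-bounded and isolated" means in practice, here is a minimal TypeScript sketch. It is not the actual scheduler code: `Trigger`, `FireResult`, `withTimeout`, and `sweep` are hypothetical names invented for this example, and the real implementation in #3626 may differ. Each fire is raced against a deadline and wrapped in its own error boundary, so a stuck or failing trigger yields a failed result while the others complete.

```typescript
// Hypothetical shapes for illustration only; not the project's real types.
type Trigger = { id: string; fire: () => Promise<void> };
type FireResult = { id: string; ok: boolean; error?: string };

// Reject if `work` has not settled within `ms` milliseconds.
// Note: this only stops waiting; it does not cancel the underlying work.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([work, deadline]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}

// Fire all due triggers concurrently. Each fire has its own deadline and its
// own try/catch, so a failure is recorded in the result instead of propagating.
async function sweep(due: Trigger[], timeoutMs: number): Promise<FireResult[]> {
  return Promise.all(
    due.map(async (t): Promise<FireResult> => {
      try {
        await withTimeout(t.fire(), timeoutMs);
        return { id: t.id, ok: true };
      } catch (err) {
        const error = err instanceof Error ? err.message : String(err);
        return { id: t.id, ok: false, error };
      }
    }),
  );
}
```

In this sketch a caller could re-queue every result with `ok: false` for a later retry. How the real scheduler retries is not specified in these notes.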
What's Changed
- chore(release): VERSION → 0.9.69 [skip ci] by @github-actions[bot] in #3617
- fix(gates): advisory scanners shouldn't fail the quality gate by @lillyboga in #3619
- fix(db): migrate up auto-baselines an existing-schema env (unblocks prod deploy) by @markokraemer in #3618
- chore(dev-eks): deploy dev-f64735de [skip ci] by @github-actions[bot] in #3621
- feat(gateway): LLM gateway/router — BYOK, managed Bedrock models, native observability by @lillyboga in #3620
- chore(dev-eks): deploy dev-20067149 [skip ci] by @github-actions[bot] in #3623
- fix(api-image): resolve @kortix/shared/llm-catalog subpath in API image (dev crashloop hotfix) by @lillyboga in #3624
- chore(dev-eks): deploy dev-81937601 [skip ci] by @github-actions[bot] in #3627
- fix(triggers): make the cron scheduler unwedgeable (timeouts + self-heal + no dead-weight leader) by @markokraemer in #3626
- chore(dev-eks): deploy dev-d69a2ec9 [skip ci] by @github-actions[bot] in #3631
Full Changelog: v0.9.68...v0.9.69