A bundle of correctness fixes around AI-summary rendering, log streaming through the BFF, the PR-comment dispatcher, and — most importantly — the run lifecycle when a user cancels an in-flight apply.
Highlights
- Cancel-while-applying now honours reality —
cancel_runon anapplyingrun no longer transitions straight tocanceled. A newcancelingintermediate state lets the reconciler pick the terminal status from observable Job outcome: if a state-version was uploaded, the run isapplied(apply landed; bookkeeping must follow reality); if the Job was killed cleanly with no state, the run iscanceledand the workspace'sstate_divergedflag is set so the operator sees the drift-warning signal. Artifact uploads are deliberately never rejected by tracking status — refusing a state upload from a Job that mutated infra would be strictly worse than the zombie symptom. - Plan output no longer truncates at the tail —
upload_plan_log/upload_apply_lognow publish alog_updatedSSE event after writing storage, so the UI's log-fetcher comes back for the bytes the runner's EXIT trap landed (typically thePLAN_HAS_CHANGES=...line and post-plan OPA output). The PR #423 fix in v0.30.7 was the necessary half — kept the poll open by holding off ETX on the Redis path — but the UI only re-fetches on event, and there was no event when the final log landed. - No more duplicate PR comments on fast status flips —
_find_or_create_commentnow serialises across replicas via a Redis SETNX mutex keyed on(workspace_id, pr_number). Without it, aqueued → planningflip within the same scheduler tick produces two dedup-distinct triggers; both dispatchers found an empty cache + empty marker search and POSTed a brand-new comment, leaving a duplicate alongside the one being edited. - AI plan summary risk_factors no longer drop out — the schema enforces presence, but neither schema nor constrained-decoding can express "non-empty when risk_level isn't low". The model occasionally returned
risk_level: "medium"withrisk_factors: [], leaving the operator with a yellow pill and no enumerated reasons. The prompt rule is now a hard rule (not a schema-comment hint), and a parse-time observability log fires when the violation slips through anyway.
Status
Beta — production-deployable; the cancel-zombie change touches the run state machine, so re-test cancel paths against your live deployment before promoting.
Full Changelog: v0.30.7...v0.30.8