Minor Changes
-
#1758
6b46b04 Thanks @threepointone! - Add progress signalling and durable milestones for agent-tool sub-agents
(#1758, rfc-detached-agent-tools §progress, phases 4a + 4b).

A sub-agent running as an agent tool (awaited or detached/background) can now
report mid-run progress:

    // Inside the child sub-agent (e.g. from a tool's execute):
    await this.reportProgress({ fraction: 0.6, phase: "deploying", message: "Generating menu page…" });

These signals ride the child's own turn stream as a transient
data-agent-progress part, so they re-broadcast to the parent's connected
clients and surface on AgentToolRunState.progress via useAgentToolEvents; a
background-runs tray can then render a live bar / phase / status line without
drilling in. Highlights:

- reportProgress({ fraction?, message?, phase?, data? }, { persist? }) on
  chat agents (@cloudflare/think, AIChatAgent); a no-op with a dev warning on
  the base Agent and when called outside an active agent-tool run. The framework
  resolves the run id from the active turn, so no threading is required. Bursts
  are coalesced (latest-wins; a fraction >= 1 "done" frame always flushes). data
  is live-only unless { persist: true }.
- onProgress(run, progress) parent hook, fired best-effort from the tail
  for both awaited and detached runs.
- Latest-snapshot persistence + recovery inspect. The child stores a
  progress_json + last_signal_at on its run row and surfaces it through
  inspectAgentToolRun().progress, so a rehydrated parent reconstructs progress
  after eviction.
- Resetting no-progress budget for detached runs. Once a detached child has
  reported at least one signal, the backbone gives up if it then goes silent for
  detachedNoProgressBudgetMs (default 1h; per-run override via
  detached: { noProgressBudgetMs }), surfaced as interrupted with the
  no-progress reason. A child that never reports is bounded only by the absolute
  detachedMaxBudgetMs ceiling; we never give up on a run merely for being slow.
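
The resetting no-progress budget described in the last highlight can be read as a small decision function. The following is a minimal TypeScript sketch under stated assumptions, not the framework's implementation: the RunClock, Budgets, and Verdict types, the checkDetachedRun helper, and the "max-budget" reason string are hypothetical names introduced for illustration. Only detachedNoProgressBudgetMs, detachedMaxBudgetMs, interrupted, and no-progress come from the entry itself.

```typescript
// Hypothetical shapes for illustration; not the framework's real types.
type RunClock = {
  startedAt: number; // ms timestamp when the detached run started
  lastSignalAt: number | null; // ms timestamp of the latest progress signal, or null if none yet
};

type Budgets = {
  detachedNoProgressBudgetMs: number; // silence allowed after the first signal
  detachedMaxBudgetMs: number; // absolute ceiling for any detached run
};

type Verdict =
  | { state: "running" }
  | { state: "interrupted"; reason: "no-progress" | "max-budget" }; // "max-budget" is an assumed label

function checkDetachedRun(run: RunClock, budgets: Budgets, now: number): Verdict {
  // The absolute ceiling applies to every run, reporting or not.
  if (now - run.startedAt >= budgets.detachedMaxBudgetMs) {
    return { state: "interrupted", reason: "max-budget" };
  }
  // The no-progress budget only arms once at least one signal has been seen,
  // and every new signal moves lastSignalAt forward, which resets the window.
  if (
    run.lastSignalAt !== null &&
    now - run.lastSignalAt >= budgets.detachedNoProgressBudgetMs
  ) {
    return { state: "interrupted", reason: "no-progress" };
  }
  return { state: "running" };
}
```

With a 1h no-progress budget and an assumed 24h ceiling, a child that has never reported is still "running" after five hours, while a child that reported once and then stayed silent for an hour is interrupted with the no-progress reason.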
Durable milestones (phase 4b)
Naming a milestone promotes a signal from the ephemeral tier to a durable
one; there is still only one emit method:

    // Inside the child sub-agent:
    await this.reportProgress({ milestone: "sources-gathered", data: { sources: 2 } });

- Persisted + replayable. Each milestone is one row on the child
  (cf_agent_tool_milestones / cf_ai_chat_agent_tool_milestones) with a
  monotonic per-run sequence. It rides the stream as a persisted
  data-agent-milestone part (vs. transient progress), so drill-in replay and a
  rehydrated parent both see it. Surfaced via inspectAgentToolRun().milestones
  and AgentToolRunState.milestones (deduped by sequence).
- onProgress fires for milestones too: the snapshot carries
  progress.milestone, so a consumer can branch on milestone vs. ephemeral.
- detached: { onMilestones } chat convenience (@cloudflare/think and
  AIChatAgent). When a configured milestone lands, the chat agent surfaces an
  idempotent synthetic chat message (keyed per (runId, name))
  before the run finishes. Delivered from both the warm tail and the cold
  backbone reconcile; the deterministic id collapses them to at-most-once. Two
  modes (the string[] shorthand defaults to "narrate"):
  - "narrate" (default): a synthetic assistant message injected directly
    (no inference), a cheap, honest status line that does not trigger a turn.
  - "react": a user-role turn so the model responds to the milestone
    (steer, start dependent work). Costs a model turn.

      detached: { onMilestones: ["preview-ready"] } // narrate (default)
      detached: { onMilestones: { names: ["needs-approval"], mode: "react" } }

  Override the wording via formatDetachedMilestone(run, milestone). These
  synthetic messages carry metadata.source so clients can render them as an
  agent event rather than a human turn (the example does this).
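
Because a milestone can reach a consumer twice (once live on the stream and once again through replay or a rehydrated inspect), deduplication by sequence is what keeps the list stable. The sketch below is one way a client-side reducer might do that. It is an illustration only: the Milestone shape and the mergeMilestones helper are hypothetical, and the real AgentToolRunState.milestones handling may differ.

```typescript
// Hypothetical milestone shape; only `sequence` and the milestone name come from the notes.
type Milestone = { sequence: number; name: string; data?: unknown };

// Merge already-known milestones with newly received ones.
// The first copy seen for a given sequence wins; the result is ordered by sequence.
function mergeMilestones(known: Milestone[], incoming: Milestone[]): Milestone[] {
  const bySequence = new Map<number, Milestone>();
  for (const milestone of known.concat(incoming)) {
    if (!bySequence.has(milestone.sequence)) {
      bySequence.set(milestone.sequence, milestone);
    }
  }
  return Array.from(bySequence.values()).sort((a, b) => a.sequence - b.sequence);
}
```

Keying on the monotonic per-run sequence rather than on the milestone name means a replayed row and its live twin collapse into one entry, while two milestones that happen to share a name but have different sequences are both kept.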
The awaitable join point (awaitAgentToolMilestone, phase 4c) is intentionally
not included here; it is gated behind a design addendum.
-
#1799
3c2afc9 Thanks @threepointone! - Stop reconnecting on terminal WebSocket close events and expose terminal connection failures via connectionError / onConnectionError on AgentClient, useAgent, and useAgentChat.
-
#1794
b6ad4d5 Thanks @threepointone! - Recover interrupted server-tool calls on resume instead of abandoning them.

When a turn is interrupted mid tool call (e.g. a server tool whose execute()
died with an evicted isolate, leaving an input-available orphan that nothing
will ever resolve), AIChatAgent now repairs the transcript before re-entering
inference on the recovered turn, the same behavior @cloudflare/think already
has. The interrupted tool part is flipped to an errored tool-result through the
shared agents/chat repair primitive, so the next convertToModelMessages no
longer 400s with AI_MissingToolResultsError and the turn continues.

Adds an overridable repairInterruptedToolPart(part) hook (default: flip to an
output-error result) so apps can customize the repaired shape for
client-resolved tools (e.g. preserve an interrupted question tool as text).
Repair only ever reshapes assistant tool parts; the corrected transcript is
persisted and broadcast through the normal write path.

Repair runs before EVERY inference chokepoint (live submit, tool
auto-continuation, continueLastTurn, saveMessages/retry, and the chat
recovery callbacks), mirroring how @cloudflare/think repairs before every
inference (the app owns convertToModelMessages, so the framework repairs
this.messages right before handing control to onChatMessage). This closes
the cases a recovery-only repair missed: a mixed client+server orphan whose
client replay drives an auto-continuation, and any agent running with
chatRecovery disabled. Repair is scoped per-part to dead SERVER orphans: a
part still legitimately awaiting a client (an input-available client tool or an
approval-requested part the user may still answer) is left verbatim, so a fresh
dead-server orphan at the leaf is repaired even when an unrelated abandoned client
orphan sits earlier in history. It is a no-op (no write, no broadcast) for a
healthy transcript.

The recovery-path stability wait (waitUntilStable) now gates on the narrower
client-resolvable predicate so a dead server-tool orphan no longer blocks
stability: it is repaired and the turn continues. waitUntilStable gains an
optional pendingInteraction predicate; its default (and the documented
semantics for app overrides) is unchanged.
-
#1788
3b2af54 Thanks @threepointone! - AIChatAgent now uses an event-driven auto-continuation barrier that parks
indefinitely on an incomplete parallel tool batch instead of force-continuing
after a fixed timeout.

Previously, when a turn ended with several parallel client tool calls and only
some results had arrived, AIChatAgent ran the completeness barrier inside
the continuation turn and polled for up to 60s
(AUTO_CONTINUATION_PENDING_TOOL_TIMEOUT_MS), after which it continued
inference against whatever results had landed, potentially a half-complete tool
batch. The barrier is now event-driven and runs before the continuation is
enqueued (converging onto @cloudflare/think's model): it fires only once every
result in the batch has arrived, re-arms as each sibling result is applied and
when a streaming turn finalizes, guards against double-fire, and is gated on no
active stream. There is no orphan timeout: a batch with a never-arriving
sibling now parks budget-free until it completes (the same way a turn already
parks on a pending HITL/client interaction) rather than force-continuing with
missing results.

This is a behavior change for the rare stuck-tool case: a result that never
arrives no longer triggers a continuation after 60s; it parks until the missing
result lands (or a later user turn / chat recovery repairs the transcript). A
parked continuation leaves the same on-disk signature as a HITL park, so a
deploy/crash mid-park recovers by re-arming rather than terminalizing.
-
#1788
3b2af54 Thanks @threepointone! - AIChatAgent now replays the live "recovering…" status on connect (#1620).

Previously the cf_agent_chat_recovering frame was only broadcast live, so a
client that connected (or reconnected) while a durable turn was mid-recovery
(between a scheduled continuation and its first chunk) saw nothing and appeared
frozen until the turn resumed or failed. It now receives the recovering status
directly on connect (when no stream is active to resume), so useAgentChat's
isRecovering reflects the in-progress recovery immediately. This converges
AIChatAgent onto @cloudflare/think's behavior. The status is still cleared on
completion, exhaustion, or any terminal outcome, and stale records (older than
the recovering-flag TTL) are skipped so a recovery abandoned without a terminal
outcome cannot show "recovering…" forever.
-
#1788
3b2af54 Thanks @threepointone! - AIChatAgent now compacts oversized tool outputs structurally instead of
replacing them with a flat summary string.

Previously, when a persisted assistant message exceeded the SQLite row-size
limit, AIChatAgent replaced each large tool output with a single English
summary string ("This tool output was too large to persist… Preview: …"),
discarding the original shape. It now uses the shared shape-preserving
truncateToolOutput compactor (the same one @cloudflare/think already used):
objects and arrays keep their structure, long strings are truncated in place
with a ... [truncated N chars] marker, and only genuinely unrepresentable
nesting collapses to a marker object. This makes a compacted tool result far
easier for the model to keep reasoning about, and converges AIChatAgent and
@cloudflare/think onto one row-size compaction path. The
metadata.compactedToolOutputs / metadata.compactedTextParts annotations and
the compaction console.warns are unchanged.
-
#1788
3b2af54 Thanks @threepointone! - AIChatAgent can now detect and recover from a hung model/transport stream via
the opt-in chatStreamStallTimeoutMs watchdog (#1626).

Set chatStreamStallTimeoutMs (a class field, like chatRecovery) to the
maximum number of milliseconds allowed between stream chunks. If a turn parks
longer than that (a hung provider or a stalled transport), the watchdog aborts
the live stream instead of leaving the turn spinning forever. When chatRecovery
is enabled, the stall is routed into the same bounded-recovery machinery a
deploy/eviction interruption uses: the partial generated so far is persisted and
a continuation is scheduled (or, once the recovery budget is spent, the
configured terminal message is delivered). With chatRecovery disabled, a stall
surfaces as a terminal stream error so the spinner is cleared.

The default is 0, which disables the watchdog (no behavior change unless you
opt in), matching @cloudflare/think. Because the watchdog measures the gap
between chunks, not total turn duration, a steadily streaming turn never trips
it regardless of overall length. Internally this is built on the shared
iterateWithStallWatchdog primitive both @cloudflare/ai-chat and
@cloudflare/think consume (an internal agents/chat seam, not a public API),
so this change ships under the @cloudflare/ai-chat bump alone.
-
#1758
6b46b04 Thanks @threepointone! - Add detached: { notify: true } support for runAgentTool on chat agents
(@cloudflare/think and AIChatAgent) (#1752).

When a detached sub-agent run finishes, a chat agent can inject a message back
into the chat so the model reacts to the result, without you wiring onFinish
by hand:

    await this.runAgentTool(ResearchAgent, { input, detached: { notify: { source: "research-background" } } });

The injected turn is idempotent per run + terminal status, so an exactly-once
finish never duplicates, while a soft give-up followed by a real late completion
surfaces as two distinct turns. (Think dedupes via a submitMessages
idempotency key; AIChatAgent, which has no durable-submission layer, persists
under a deterministic message id and runs the follow-up turn inline within the
already-serialized delivery slot.) Use notify: true for the default
metadata.source, pass notify: { source } to match your app's message
taxonomy, and override formatDetachedCompletion(run, result) to customize (or
suppress) the injected text.
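
The "idempotent per run + terminal status" rule comes down to how the dedupe key is built. The sketch below shows the idea with an in-memory ledger. Everything in it is an assumption made for illustration: the TerminalStatus names other than interrupted, the notifyKey format, and the NotifyLedger class are hypothetical, and the real hosts persist this state (a submitMessages idempotency key in Think, a deterministic message id in AIChatAgent) rather than holding it in memory.

```typescript
// Hypothetical status names; the notes only say "terminal status".
type TerminalStatus = "completed" | "error" | "interrupted";

// Assumed key format: one key per (run, terminal status) pair.
function notifyKey(runId: string, status: TerminalStatus): string {
  return `detached-notify:${runId}:${status}`;
}

class NotifyLedger {
  private seen = new Set<string>();

  // Returns true when a turn should be injected, false for a repeat delivery.
  shouldInject(runId: string, status: TerminalStatus): boolean {
    const key = notifyKey(runId, status);
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }
}
```

Including the status in the key is what lets a soft give-up (interrupted) and a later real completion for the same run each produce their own turn, while a redelivered finish with the same status is dropped.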
Patch Changes
-
#1788
3b2af54 Thanks @threepointone! - AIChatAgent now delivers the terminal banner before persisting the durable
terminal record when chat recovery gives up, converging onto
@cloudflare/think's broadcast-first ordering.

Previously _exhaustChatRecovery persisted the durable terminal record first
and broadcast the banner second. A terminal-record write can reject in the
deploy/storage window a give-up runs in (#1730); under persist-first the throw
propagated before the banner was sent, so the live banner was dropped on that
pass and only delivered on the healthy re-run (potentially a different isolate,
after the affected connections had gone). Broadcasting first makes the banner
resilient to a failing storage write: the throw still propagates and the whole
give-up re-runs on a healthy isolate, which persists the record idempotently and
re-delivers the banner (the documented at-least-once edge). Persisting first
gained no durability (the re-run persists either way) while losing this banner
resilience, so both chat hosts now terminalize broadcast-first.
-
#1801
c58b401 Thanks @threepointone! - Refactor @cloudflare/ai-chat/react to re-export the shared implementation from agents/chat/react while preserving existing behavior and exports.
-
#1797
f599892 Thanks @threepointone! - Fix: a recovered agent-tool child turn now re-binds its run row to the
recovery turn's request id (parity with @cloudflare/think).

When an AIChatAgent facet running as an agent-tool child was interrupted
mid-run, its recovery continuation (continueLastTurn / _retryLastUserTurn)
minted a fresh request id but left cf_ai_chat_agent_tool_runs.request_id
pointing at the pre-eviction turn, breaking frame attribution. A long-running
recovered child then forwarded nothing to the parent's re-attach tail and could
be abandoned as interrupted once the no-progress budget elapsed. The recovery
paths now re-bind the child-run row (and the in-memory attribution map) so frames
keep flowing across recovery.
-
#1802
391b034 Thanks @threepointone! - Ensure tool approval updates always retain a provider-facing approval id.

Older or hand-seeded transcripts can contain an approval-requested tool part
without an approval.id. When that part is approved and auto-continuation
re-enters inference, the AI SDK requires a matching approval id in the converted
model messages. Approval updates now synthesize a stable id from the
toolCallId when the transcript is missing one, preventing invalid prompt
errors while preserving existing approval metadata. @cloudflare/ai-chat now
routes its approval merge through the shared toolApprovalUpdate builder so it
benefits from the same fallback instead of its own divergent copy.
-
#1788
3b2af54 Thanks @threepointone! - Converge recovery forward-progress crediting between AIChatAgent and Think.

Both hosts now credit the recovery no-progress counter through one shared, host-agnostic rule (shouldCreditStreamProgress): a progress milestone (a started text/reasoning segment or a settled tool input/output) credits unconditionally, and mid-segment streaming deltas (text-delta / reasoning-delta / tool-input-delta) credit at most once per throttle window via a per-isolate StreamProgressCreditThrottle. Previously AIChatAgent credited only on chunk-type milestones while Think credited on its flush cadence, so a long single content segment spanning repeated crashes could read as "no progress" under AIChatAgent and false-fire its no_progress_timeout. The new rule is never coarser than either host's prior cadence, so it can only delay or avoid a false no-progress timeout, never hasten give-up.
-
#1803
c476265 Thanks @threepointone! - Fix AI SDK status getting stuck after a reconnect that races a turn's
pre-stream window (#1784).

A turn is "accepted but pre-stream" while it is queued, debouncing, or awaiting
async setup before its resumable stream starts. A client that connected or sent
a STREAM_RESUME_REQUEST in that window was answered with STREAM_RESUME_NONE
("nothing to resume"), so its short resume probe resolved null and AI SDK
status settled on ready even though the server went on to stream, leaving
the UI unable to render the in-flight turn until a full remount.

This adds a shared PreStreamTurns tracker (agents/chat) and a new
server→client cf_agent_stream_pending frame:

- The resume handshake now parks resume requests that arrive during the
  pre-stream window and emits STREAM_PENDING ("keep waiting") instead of
  STREAM_RESUME_NONE, then flushes parked connections into the normal
  STREAM_RESUMING handshake once the stream actually starts (and releases them
  with STREAM_RESUME_NONE if the turn is superseded/cleared before streaming).
- On STREAM_PENDING the client transport extends its resume probe from the
  5s fast-path to a 60s backstop so the probe stays open across the gap.
- useAgentChat re-probes the stream on a transparent socket reopen (e.g. a
  1006 reconnect that does not remount the component) so status recovers.
- Continuation affinity is relaxed via an optional isConnectionPresent host
  hook so a transparent reconnect (whose connection id changed) can resume a
  continuation whose original owner connection is gone.

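
The parking behavior in the first bullet can be modelled as a small in-memory state machine. The sketch below is an illustration under stated assumptions, not the shared PreStreamTurns tracker: the PreStreamParkingLot class, its method names, and the Flush shape are hypothetical, and the rule that a cleared turn only releases parked connections when no other pre-stream turn is still pending is this sketch's reading of the notes. The frame names are used as plain string labels.

```typescript
// Result of a state change: which frame to send to which parked connections.
type Flush = {
  frame: "STREAM_RESUMING" | "STREAM_RESUME_NONE";
  connectionIds: string[];
};

class PreStreamParkingLot {
  private pendingTurns = new Set<string>(); // accepted turns that have not started streaming
  private parked = new Set<string>(); // connections told to keep waiting

  acceptTurn(turnId: string): void {
    this.pendingTurns.add(turnId);
  }

  // Returns "STREAM_PENDING" when the request was parked, or null when no turn
  // is pre-stream and the normal resume handshake (not modelled here) decides.
  onResumeRequest(connectionId: string): "STREAM_PENDING" | null {
    if (this.pendingTurns.size === 0) return null;
    this.parked.add(connectionId);
    return "STREAM_PENDING";
  }

  // The stream actually started: flush every parked connection into resuming.
  onStreamStarted(turnId: string): Flush {
    this.pendingTurns.delete(turnId);
    return this.drain("STREAM_RESUMING");
  }

  // The turn ended without streaming. Assumption: parked connections are only
  // released when no successor pre-stream turn could still serve them.
  onTurnCleared(turnId: string): Flush {
    this.pendingTurns.delete(turnId);
    if (this.pendingTurns.size > 0) {
      return { frame: "STREAM_RESUME_NONE", connectionIds: [] };
    }
    return this.drain("STREAM_RESUME_NONE");
  }

  private drain(frame: Flush["frame"]): Flush {
    const connectionIds = Array.from(this.parked);
    this.parked.clear();
    return { frame, connectionIds };
  }
}
```

The point of the design is that a parked connection always gets exactly one follow-up frame: either it is moved into the resuming handshake when a stream starts, or it is told there is nothing to resume once no pre-stream turn remains.
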
Wired into both AIChatAgent and @cloudflare/think.

The pre-stream tracker is in-memory only; it is hibernation-safe because a turn
in its pre-stream window is an unresolved message-handler promise that pins the
Durable Object in memory, so eviction only happens once a stream is durably
recorded (and resumes via ResumableStream) or the turn has finished. Skipped
turns (supersede/generation change) settle without releasing parked
connections, so a client parked during the window survives onto the successor
turn instead of being cut loose by a premature STREAM_RESUME_NONE.
-
#1788
3b2af54 Thanks @threepointone! - Recovery give-up now resolves the orphaned stream by newest metadata row.

The stable-timeout/error give-up path that terminalizes an exhausted recovery
turn previously resolved the turn's orphaned stream id with an in-memory
first-match scan over all stream metadata, while the wake (restart) path already
used the newest durable row keyed by the recovery-root request id. These two
lookups are now a single seam, so both paths surface the same partial (the
newest stream the turn produced) when a request id spans more than one
recovery attempt. Single-attempt turns (one stream row per request id) are
unaffected.
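
The difference between the two lookups is easiest to see side by side. The following sketch is an illustration only: the StreamRow shape, its field names, and both helper functions are hypothetical stand-ins for the real stream metadata and the shared seam.

```typescript
// Hypothetical metadata row: one row per stream a recovery attempt produced.
type StreamRow = { streamId: string; requestId: string; createdAt: number };

// Old give-up behavior, roughly: take the first row that matches the request id.
function firstMatchStreamId(rows: StreamRow[], requestId: string): string | null {
  const hit = rows.find((row) => row.requestId === requestId);
  return hit ? hit.streamId : null;
}

// Converged behavior, roughly: take the newest row for the request id.
function newestStreamId(rows: StreamRow[], requestId: string): string | null {
  let newest: StreamRow | null = null;
  for (const row of rows) {
    if (row.requestId !== requestId) continue;
    if (newest === null || row.createdAt > newest.createdAt) newest = row;
  }
  return newest ? newest.streamId : null;
}
```

For a request id with a single stream row both functions return the same id, which matches the note that single-attempt turns are unaffected. They only diverge when one request id spans several recovery attempts, where the newest row is the stream holding the latest partial.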