github cloudflare/agents @cloudflare/think@0.11.0

Minor Changes

  • #1758 6b46b04 Thanks @threepointone! - Add progress signalling and durable milestones for agent-tool sub-agents
    (#1758, rfc-detached-agent-tools §progress, phases 4a + 4b).

    A sub-agent running as an agent tool (awaited or detached/background) can now
    report mid-run progress:

    // Inside the child sub-agent (e.g. from a tool's execute):
    await this.reportProgress({
      fraction: 0.6,
      phase: "deploying",
      message: "Generating menu page…"
    });

    These signals ride the child's own turn stream as a transient
    data-agent-progress part, so they re-broadcast to the parent's connected
    clients and surface on AgentToolRunState.progress via useAgentToolEvents — a
    background-runs tray can render a live bar / phase / status line without drilling
    in. Highlights:

    • reportProgress({ fraction?, message?, phase?, data? }, { persist? }) on
      chat agents (@cloudflare/think, AIChatAgent); a no-op with a dev warning on
      the base Agent and when called outside an active agent-tool run. The framework
      resolves the run id from the active turn — no threading required. Bursts are
      coalesced (latest-wins; a fraction >= 1 "done" frame always flushes). data
      is live-only unless { persist: true }.
    • onProgress(run, progress) parent hook, fired best-effort from the tail
      for both awaited and detached runs.
    • Latest-snapshot persistence + recovery inspect. The child stores
      progress_json and last_signal_at on its run row and surfaces the snapshot
      through inspectAgentToolRun().progress, so a rehydrated parent can
      reconstruct progress after eviction.
    • Resetting no-progress budget for detached runs. Once a detached child has
      reported at least one signal, the backbone gives up if it then goes silent for
      detachedNoProgressBudgetMs (default 1h; per-run override via
      detached: { noProgressBudgetMs }), surfaced as interrupted with the
      no-progress reason. A child that never reports is bounded only by the absolute
      detachedMaxBudgetMs ceiling — we never give up on a run merely for being slow.
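
The two budgets in the last bullet can be read as a small decision function. What follows is a minimal sketch under stated assumptions, not the framework's backbone code: the DetachedRunClock and DetachedBudgets shapes, the decideGiveUp name, and the "max-budget" reason label are hypothetical. Only the no-progress reason and the resetting/absolute split come from the notes above.

```typescript
// Hypothetical timing state for one detached run (not the framework's types).
interface DetachedRunClock {
  startedAt: number; // ms timestamp when the run started
  lastSignalAt: number | null; // ms timestamp of the latest signal, null if none yet
}

interface DetachedBudgets {
  noProgressBudgetMs: number; // window that restarts on every signal
  maxBudgetMs: number; // absolute ceiling for the whole run
}

type GiveUpDecision =
  | { giveUp: false }
  | { giveUp: true; reason: "no-progress" | "max-budget" };

function decideGiveUp(
  run: DetachedRunClock,
  budgets: DetachedBudgets,
  now: number
): GiveUpDecision {
  // The absolute ceiling bounds every run, whether or not it reports progress.
  if (now - run.startedAt >= budgets.maxBudgetMs) {
    return { giveUp: true, reason: "max-budget" };
  }
  // The no-progress window only arms after the first signal, so a child that
  // never reports is not abandoned merely for being slow.
  if (
    run.lastSignalAt !== null &&
    now - run.lastSignalAt >= budgets.noProgressBudgetMs
  ) {
    return { giveUp: true, reason: "no-progress" };
  }
  return { giveUp: false };
}
```

Passing now explicitly keeps the rule deterministic; a real host would presumably read the clock and the stored last_signal_at instead.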

    Durable milestones (phase 4b)

    Naming a milestone promotes a signal from the ephemeral tier to a durable
    one — there is still only one emit method:

    // Inside the child sub-agent:
    await this.reportProgress({
      milestone: "sources-gathered",
      data: { sources: 2 }
    });

    • Persisted + replayable. Each milestone is one row on the child
      (cf_agent_tool_milestones / cf_ai_chat_agent_tool_milestones) with a
      monotonic per-run sequence. It rides the stream as a persisted
      data-agent-milestone part (vs. transient progress), so drill-in replay and a
      rehydrated parent both see it. Surfaced via inspectAgentToolRun().milestones
      and AgentToolRunState.milestones (deduped by sequence).

    • onProgress fires for milestones too — the snapshot carries
      progress.milestone, so a consumer can branch on milestone vs. ephemeral.

    • detached: { onMilestones } chat convenience (@cloudflare/think and
      AIChatAgent). When a configured milestone lands, the chat agent surfaces a
      synthetic chat message (keyed, and therefore idempotent, per (runId, name))
      before the run finishes. Delivered from both the warm tail and the cold
      backbone reconcile; the deterministic id collapses them to at-most-once. Two
      modes (the string[] shorthand defaults to "narrate"):

      • "narrate" (default) — a synthetic assistant message injected directly
        (no inference): a cheap, honest status line that does not trigger a turn.
      • "react" — a user-role turn so the model responds to the milestone
        (steer, start dependent work). Costs a model turn.

      detached: { onMilestones: ["preview-ready"] } // narrate (default)
      detached: { onMilestones: { names: ["needs-approval"], mode: "react" } }

      Override the wording via formatDetachedMilestone(run, milestone). These
      synthetic messages carry metadata.source so clients can render them as an
      agent event rather than a human turn (the example does this).
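
To illustrate how the onMilestones shorthand and the at-most-once delivery described above might fit together, here is a hedged sketch. The function names, the id format, and the optional mode field on the object form are assumptions made for illustration; the notes only state that the string[] shorthand defaults to "narrate" and that a deterministic id collapses warm-tail and cold-reconcile deliveries.

```typescript
type MilestoneMode = "narrate" | "react";
type OnMilestones = string[] | { names: string[]; mode?: MilestoneMode };

// Expand the string[] shorthand into the explicit form; "narrate" is the default.
function normalizeOnMilestones(
  config: OnMilestones
): { names: string[]; mode: MilestoneMode } {
  if (Array.isArray(config)) {
    return { names: config, mode: "narrate" };
  }
  return {
    names: config.names,
    mode: config.mode === undefined ? "narrate" : config.mode
  };
}

// Hypothetical deterministic id: the same (runId, name) always maps to the same id.
function milestoneMessageId(runId: string, name: string): string {
  return "milestone:" + runId + ":" + name;
}

// Record a delivery keyed by the deterministic id. Returns true only the first
// time, so a second delivery path for the same milestone becomes a no-op.
function deliverOnce(
  delivered: Record<string, true>,
  runId: string,
  name: string
): boolean {
  const id = milestoneMessageId(runId, name);
  if (delivered[id] === true) {
    return false;
  }
  delivered[id] = true;
  return true;
}
```

In this sketch the warm tail and the cold backbone reconcile would both call deliverOnce with the same (runId, name); whichever arrives second returns false and injects nothing.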

    The awaitable join point (awaitAgentToolMilestone, phase 4c) is intentionally
    not included here — it is gated behind a design addendum.

  • #1758 6b46b04 Thanks @threepointone! - Add detached: { notify: true } support for runAgentTool on chat agents
    (@cloudflare/think and AIChatAgent) (#1752).

    When a detached sub-agent run finishes, a chat agent can inject a message back
    into the chat so the model reacts to the result — without you wiring onFinish
    by hand:

    await this.runAgentTool(ResearchAgent, {
      input,
      detached: { notify: { source: "research-background" } }
    });

    The injected turn is idempotent per run + terminal status, so an exactly-once
    finish never duplicates, while a soft give-up followed by a real late completion
    surfaces as two distinct turns. (Think dedupes via a submitMessages
    idempotency key; AIChatAgent, which has no durable-submission layer, persists
    under a deterministic message id and runs the follow-up turn inline within the
    already-serialized delivery slot.) Use notify: true for the default
    metadata.source, pass notify: { source } to match your app's message
    taxonomy, and override formatDetachedCompletion(run, result) to customize (or
    suppress) the injected text.
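
The "idempotent per run + terminal status" rule can be sketched as a key function plus a seen-set. This is an illustration only: the TerminalStatus values, the key format, and the NotifyInbox class are hypothetical and do not reflect how Think's submitMessages key or AIChatAgent's message id are actually built.

```typescript
// Hypothetical set of terminal statuses for a detached run.
type TerminalStatus = "completed" | "error" | "interrupted";

// One key per (run, terminal status): redelivery of the same terminal collapses,
// while a different terminal for the same run gets its own key.
function notifyKey(runId: string, status: TerminalStatus): string {
  return "detached-notify:" + runId + ":" + status;
}

class NotifyInbox {
  private seen: Record<string, true> = {};
  readonly turns: string[] = [];

  // Returns true when a new turn was injected, false when the key was already seen.
  inject(runId: string, status: TerminalStatus, text: string): boolean {
    const key = notifyKey(runId, status);
    if (this.seen[key] === true) {
      return false;
    }
    this.seen[key] = true;
    this.turns.push(text);
    return true;
  }
}
```

Under this keying, a soft give-up (interrupted) followed by a late real completion produces two turns, and a repeated delivery of either one produces none.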

  • #1817 7f367d8 Thanks @threepointone! - create-think now prompts for a starter template when --template is omitted (and falls back to basic when stdin is non-interactive). npm create think and think init initialize a git repository — skipping cleanly when the target is already inside one — and scaffold projects with Oxlint/Oxfmt config plus a check script. Removes the unused declarative agent() framework helper and the identity helpers (defineMessengers, defineScheduledTasks, defineChannels) in favor of class-based agents and typed object returns.

  • #1790 190ea81 Thanks @threepointone! - Add ctx.attachReply(attachment) for actions: an advisory, recording-only reply-attachment side-channel surfaced on ChatResponseResult.attachments (in onChatResponse) and a public replyAttachments(requestId?) getter. Attachments are JSON-normalized, deep-copied on read, capped per turn, and never alter the model-visible tool output; policy callbacks are no-ops, failed executions discard their attachments, approval-gated approved actions support it, and durable-pause approved actions are a v1 no-op.

  • #1790 190ea81 Thanks @threepointone! - Add a pending-retry lease for the action ledger via the new actionLedgerPendingRetryLeaseMs config (default 5 minutes). A pending ledger row left behind by a crashed executor is now reclaimed and re-run once it is stale, but ONLY for actions that declare an explicit idempotencyKey — the key is the developer's assertion that re-running the keyed side effect is safe. Behavior change: such a stale row previously blocked forever with ActionPendingError; it now reclaims (refreshing updated_at in place, still pending), emits action:ledger:reclaimed, and re-runs execute. Fresh rows, fallback tool:${toolCallId} keys, and a disabled lease (actionLedgerPendingRetryLeaseMs = false) keep the conservative ActionPendingError behavior. Same-isolate coalescing still wins first, so an in-flight run is never reclaimed.
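
The reclaim conditions above combine into a single predicate. The sketch below is a reading of that paragraph, not the ledger's implementation: the PendingRowSketch shape, the canReclaim name, and the choice of >= for staleness are assumptions.

```typescript
// Hypothetical view of a pending ledger row.
interface PendingRowSketch {
  explicitKey: boolean; // true when the action declared its own idempotencyKey
  updatedAt: number; // ms timestamp of the last write to the row
  inFlight: boolean; // true when this isolate is still running the action
}

// A number of milliseconds, or false to disable the lease entirely.
type LeaseSetting = number | false;

function canReclaim(
  row: PendingRowSketch,
  leaseMs: LeaseSetting,
  now: number
): boolean {
  // Same-isolate coalescing wins first: an in-flight run is never reclaimed.
  if (row.inFlight) return false;
  // A disabled lease keeps the conservative ActionPendingError behavior.
  if (leaseMs === false) return false;
  // Fallback tool-call keys carry no developer assertion that a re-run is safe.
  if (!row.explicitKey) return false;
  // Only a stale row is reclaimed; a fresh row is still treated as pending.
  return now - row.updatedAt >= leaseMs;
}
```

With the default lease of 5 minutes, a keyed row last updated six minutes ago would be reclaimed and re-run, while the same row with a fallback key would still surface ActionPendingError.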

  • #1790 190ea81 Thanks @threepointone! - Add a durable action ledger for action() descriptors so settled server action outputs can be replayed by stable idempotency key without re-running side effects.
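
As a rough illustration of replay by idempotency key, the sketch below uses an in-memory object in place of durable storage. InMemoryLedgerSketch and its run method are hypothetical and synchronous for brevity; the real ledger is durable and tied to action() descriptors.

```typescript
class InMemoryLedgerSketch {
  // key -> JSON-serialized settled output (assumes outputs are JSON-safe)
  private settled: Record<string, string> = {};
  executions = 0;

  run<T>(key: string, execute: () => T): T {
    // A settled key replays the stored output without re-running the side effect.
    if (Object.prototype.hasOwnProperty.call(this.settled, key)) {
      return JSON.parse(this.settled[key]) as T;
    }
    this.executions += 1;
    const output = execute();
    this.settled[key] = JSON.stringify(output);
    return output;
  }
}
```

Calling run twice with the same key executes the callback once; the second call returns a fresh copy parsed from the stored JSON.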

  • #1790 190ea81 Thanks @threepointone! - Generalize the messenger runtime into a public channel surface. Add configureChannels() and ChannelDefinition (web, voice, messenger, and custom channels) wrapping getMessengers(), a no-turn deliverNotice() with informModel, additive DeliveryTag (kind + turnEnded) on messenger snapshots, per-channel policy (instructions, tool-narrowing, maxTurns) applied as overridable defaults, turn-scoped channel context threaded through runTurn (persisted for recovery), reply-attachment rendering at delivery, and channel:*/notice:* observability events.

  • #1790 190ea81 Thanks @threepointone! - Add durable-pause approval descriptors: durable-pause actions now park in a dedicated cf_think_action_pending_approvals store and resume via approveExecution/rejectExecution with a connection-independent continuation, so a turn can be approved from a dashboard with no live socket (this also fixes codemode approveExecution from a dashboard). A unified ActionApprovalDescriptor is attached to durable-pause, codemode, and approval-gated parts, pendingApprovals() lists all pending approvals for cold-load reconciliation, and an overridable describePausedExecution() hook enriches codemode descriptors.

  • #1801 c58b401 Thanks @threepointone! - Add @cloudflare/think/react, a Think-tuned useAgentChat wrapper that keeps setMessages local-only by default while reusing the shared chat React implementation.

  • #1788 3b2af54 Thanks @threepointone! - Think now annotates and logs row-size compaction the same way
    @cloudflare/ai-chat does.

    When a persisted message exceeds the SQLite row-size limit and Think compacts
    its tool outputs or truncates its text parts to fit, the resulting message now
    carries metadata.compactedToolOutputs (the compacted tool-call IDs) and/or
    metadata.compactedTextParts (the truncated text-part indices), and Think
    emits a console.warn describing the compaction. The compaction itself is
    unchanged — Think already used the shared shape-preserving truncateToolOutput
    compactor — this only adds the previously ai-chat-only annotations/warnings so a
    client can tell that a stored message was compacted. Both packages now share one
    enforceRowSizeLimit implementation.
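
A client could use these annotations to flag compacted messages. The following is a small sketch, assuming compactedToolOutputs is an array of tool-call ID strings and compactedTextParts is an array of numeric part indices, as the description suggests; the describeCompaction helper is hypothetical.

```typescript
// Assumed shape of the compaction annotations on message metadata.
interface CompactionMetadataSketch {
  compactedToolOutputs?: string[]; // compacted tool-call IDs
  compactedTextParts?: number[]; // truncated text-part indices
}

// Returns a short human-readable note, or null when nothing was compacted.
function describeCompaction(
  metadata: CompactionMetadataSketch | undefined
): string | null {
  if (metadata === undefined) return null;
  const tools = metadata.compactedToolOutputs || [];
  const texts = metadata.compactedTextParts || [];
  const notes: string[] = [];
  if (tools.length > 0) notes.push(tools.length + " tool output(s) compacted");
  if (texts.length > 0) notes.push(texts.length + " text part(s) truncated");
  return notes.length > 0 ? notes.join(", ") : null;
}
```

A UI might render the returned note next to the message so readers know the stored copy is not the full original.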

  • #1790 190ea81 Thanks @threepointone! - Add public runTurn(options) facade (Turns RFC step 2): unified turn admission
    with mode: "wait" | "submit" | "stream" delegating to the existing
    saveMessages, continueLastTurn, submitMessages, and chat methods.
    Exports TurnInputMessages, RunTurnWait, RunTurnSubmit, RunTurnStream,
    RunTurnOptions, and TurnResult.

  • #1799 3c2afc9 Thanks @threepointone! - Allow runTurn({ mode: "stream" }) to accept array and function inputs, matching the existing wait mode input surface while preserving the durable submit function-input guard.
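
The input rule for the three modes can be pictured as follows. This is a hypothetical model, not the exported RunTurnOptions types: SketchTurnInput, resolveTurnInput, and the idea that a function input receives the current messages are assumptions, as is the stated reason for the submit guard.

```typescript
type TurnMode = "wait" | "submit" | "stream";

interface SketchMessage {
  role: "user" | "assistant";
  text: string;
}

// Hypothetical input surface: a single message, an array, or a function.
type SketchTurnInput =
  | SketchMessage
  | SketchMessage[]
  | ((current: SketchMessage[]) => SketchMessage[]);

function resolveTurnInput(
  mode: TurnMode,
  input: SketchTurnInput,
  current: SketchMessage[]
): SketchMessage[] {
  if (typeof input === "function") {
    // Assumed rationale: a durable submission has to be persisted, and a
    // function cannot be, so submit mode keeps rejecting function inputs.
    if (mode === "submit") {
      throw new Error("function inputs are not supported in submit mode");
    }
    return input(current);
  }
  return Array.isArray(input) ? input : [input];
}
```

In this model "wait" and "stream" accept all three input forms, and only "submit" with a function input throws.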

Patch Changes

  • #1788 3b2af54 Thanks @threepointone! - Converge recovery forward-progress crediting between AIChatAgent and Think.

    Both hosts now credit the recovery no-progress counter through one shared, host-agnostic rule (shouldCreditStreamProgress): a progress milestone (a started text/reasoning segment or a settled tool input/output) credits unconditionally, and mid-segment streaming deltas (text-delta/reasoning-delta/tool-input-delta) credit at most once per throttle window via a per-isolate StreamProgressCreditThrottle. Previously AIChatAgent credited only on chunk-type milestones while Think credited on its flush cadence, so a long single content segment spanning repeated crashes could read as "no progress" under AIChatAgent and false-fire its no_progress_timeout. The new rule is never coarser than either host's prior cadence, so it can only delay or avoid a false no-progress timeout, never hasten give-up.
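
The crediting rule can be sketched as a throttle plus a classifier. This is an illustration of the rule as described, not the shared shouldCreditStreamProgress code: the chunk kinds are simplified to three hypothetical labels, and the sketch assumes that a milestone credit does not reset the delta throttle window.

```typescript
// Simplified chunk classification (hypothetical labels).
type ChunkKindSketch = "segment-start" | "tool-settled" | "delta";

// Allows at most one credit per window; one instance per isolate is assumed.
class CreditThrottleSketch {
  private windowMs: number;
  private lastCreditAt: number | null = null;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  tryCredit(now: number): boolean {
    if (this.lastCreditAt !== null && now - this.lastCreditAt < this.windowMs) {
      return false;
    }
    this.lastCreditAt = now;
    return true;
  }
}

function shouldCreditSketch(
  kind: ChunkKindSketch,
  throttle: CreditThrottleSketch,
  now: number
): boolean {
  // Progress milestones credit unconditionally.
  if (kind !== "delta") return true;
  // Mid-segment deltas credit at most once per throttle window.
  return throttle.tryCredit(now);
}
```

A long single text segment therefore still earns periodic credit from its deltas, which is the case that previously could read as "no progress" under AIChatAgent.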

  • #1803 c476265 Thanks @threepointone! - Fix AI SDK status getting stuck after a reconnect that races a turn's
    pre-stream window (#1784).

    A turn is "accepted but pre-stream" while it is queued, debouncing, or awaiting
    async setup before its resumable stream starts. A client that connected or sent
    a STREAM_RESUME_REQUEST in that window was answered with STREAM_RESUME_NONE
    ("nothing to resume"), so its short resume probe resolved null and AI SDK
    status settled on ready even though the server went on to stream — leaving
    the UI unable to render the in-flight turn until a full remount.

    This adds a shared PreStreamTurns tracker (agents/chat) and a new
    server→client cf_agent_stream_pending frame:

    • The resume handshake now parks resume requests that arrive during the
      pre-stream window and emits STREAM_PENDING ("keep waiting") instead of
      STREAM_RESUME_NONE, then flushes parked connections into the normal
      STREAM_RESUMING handshake once the stream actually starts (and releases them
      with STREAM_RESUME_NONE if the turn is superseded/cleared before streaming).
    • On STREAM_PENDING the client transport extends its resume probe from the
      5s fast-path to a 60s backstop so the probe stays open across the gap.
    • useAgentChat re-probes the stream on a transparent socket reopen (e.g. a
      1006 reconnect that does not remount the component) so status recovers.
    • Continuation affinity is relaxed via an optional isConnectionPresent host
      hook so a transparent reconnect (whose connection id changed) can resume a
      continuation whose original owner connection is gone.
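
The server side of the handshake in the first bullet can be modeled as a tiny state machine. The sketch below is a simplified, hypothetical model (the PreStreamSketch class and its method names are not the PreStreamTurns API); it covers a single turn and does not model the case where a skipped turn hands its parked connections to a successor.

```typescript
type ResumeFrame = "STREAM_PENDING" | "STREAM_RESUMING" | "STREAM_RESUME_NONE";

class PreStreamSketch {
  private preStream = false;
  private streaming = false;
  private parked: string[] = [];
  // Frames "sent" to connections, recorded in order for inspection.
  readonly sent: { connectionId: string; frame: ResumeFrame }[] = [];

  // A turn was accepted but its resumable stream has not started yet.
  acceptTurn(): void {
    this.preStream = true;
  }

  onResumeRequest(connectionId: string): void {
    if (this.streaming) {
      this.sent.push({ connectionId, frame: "STREAM_RESUMING" });
    } else if (this.preStream) {
      // Park the request and tell the client to keep waiting.
      this.parked.push(connectionId);
      this.sent.push({ connectionId, frame: "STREAM_PENDING" });
    } else {
      this.sent.push({ connectionId, frame: "STREAM_RESUME_NONE" });
    }
  }

  // The stream actually started: flush parked connections into the resume handshake.
  streamStarted(): void {
    this.preStream = false;
    this.streaming = true;
    this.flush("STREAM_RESUMING");
  }

  // The turn was cleared before streaming: release parked connections.
  turnClearedBeforeStream(): void {
    this.preStream = false;
    this.flush("STREAM_RESUME_NONE");
  }

  private flush(frame: ResumeFrame): void {
    for (const connectionId of this.parked) {
      this.sent.push({ connectionId, frame });
    }
    this.parked = [];
  }
}
```

A client that sees STREAM_PENDING would extend its probe and wait for one of the two follow-up frames, rather than settling on ready.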

    Wired into both AIChatAgent and @cloudflare/think.

    The pre-stream tracker is in-memory only; it is hibernation-safe because a turn
    in its pre-stream window is an unresolved message-handler promise that pins the
    Durable Object in memory, so eviction only happens once a stream is durably
    recorded (and resumes via ResumableStream) or the turn has finished. Skipped
    turns (supersede/generation change) settle without releasing parked
    connections, so a client parked during the window survives onto the successor
    turn instead of being cut loose by a premature STREAM_RESUME_NONE.

  • #1788 3b2af54 Thanks @threepointone! - Recovery give-up now resolves the orphaned stream by newest metadata row.

    The stable-timeout/error give-up path that terminalizes an exhausted recovery
    turn previously resolved the turn's orphaned stream id with an in-memory
    first-match scan over all stream metadata, while the wake (restart) path already
    used the newest durable row keyed by the recovery-root request id. These two
    lookups are now a single seam, so both paths surface the same partial — the
    newest stream the turn produced — when a request id spans more than one
    recovery attempt. Single-attempt turns (one stream row per request id) are
    unaffected.
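
The difference between the two lookups shows up only when one request id has several stream rows. Here is a minimal sketch with a hypothetical StreamRowSketch shape (the real metadata columns are not given in these notes), contrasting first-match with newest-row resolution.

```typescript
// Hypothetical stream metadata row.
interface StreamRowSketch {
  streamId: string;
  requestId: string;
  createdAt: number; // ms timestamp
}

// Old give-up behavior in spirit: the first row that matches wins.
function firstMatch(
  rows: StreamRowSketch[],
  requestId: string
): StreamRowSketch | null {
  for (const row of rows) {
    if (row.requestId === requestId) return row;
  }
  return null;
}

// Unified behavior in spirit: the newest matching row wins.
function newestMatch(
  rows: StreamRowSketch[],
  requestId: string
): StreamRowSketch | null {
  let newest: StreamRowSketch | null = null;
  for (const row of rows) {
    if (row.requestId !== requestId) continue;
    if (newest === null || row.createdAt > newest.createdAt) {
      newest = row;
    }
  }
  return newest;
}
```

With one row per request id the two functions agree, which matches the note that single-attempt turns are unaffected.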

  • #1794 b6ad4d5 Thanks @threepointone! - Extract transcript repair into a shared agents/chat primitive.

    @cloudflare/think's _repairToolTranscriptParts — which flips an interrupted
    tool call (a tool-* / dynamic-tool part with no settled result, left behind
    when a stream was cut off mid-flight) into an errored tool-result so the next
    provider call doesn't 400 with AI_MissingToolResultsError, and normalizes
    malformed tool input — now lives once as the shared, @internal
    repairInterruptedToolParts primitive (plus the toolPartHasSettledResult
    terminal-state check) in agents/chat.

    The primitive is pure (returns a new messages array plus repair stats; never
    touches storage, broadcast, or events) and is parameterized by an overridable
    repairPart hook plus an optional shouldRepair(part) skip predicate (defaults
    to repairing every interrupted part), so both AI-SDK chat hosts can run repair
    logic before re-entering inference on a recovered turn — a host whose default
    errors the part (ai-chat) uses shouldRepair to leave a part still awaiting a
    client interaction verbatim. @cloudflare/think delegates through its existing
    repairInterruptedToolPart hook with no shouldRepair (repairs everything) — a
    pure internal refactor with no observable behavior or API change; its suites pass
    unchanged.
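
The shape of such a pure repair pass can be sketched as below. This is not the @internal repairInterruptedToolParts primitive: the part and message types are heavily simplified (a single "tool" part type stands in for tool-* / dynamic-tool), the error text is made up, and the overridable repairPart hook is omitted.

```typescript
type ToolStateSketch =
  | "input-streaming"
  | "input-available"
  | "output-available"
  | "output-error";

interface ToolPartSketch {
  type: "tool";
  toolCallId: string;
  state: ToolStateSketch;
  errorText?: string;
}
interface TextPartSketch {
  type: "text";
  text: string;
}
type PartSketch = ToolPartSketch | TextPartSketch;
interface MessageSketch {
  id: string;
  parts: PartSketch[];
}

// Terminal-state check: a tool part is settled once it has an output or an error.
function hasSettledResult(part: ToolPartSketch): boolean {
  return part.state === "output-available" || part.state === "output-error";
}

// Pure: returns a new messages array plus a repair count, and never mutates input.
function repairInterrupted(
  messages: MessageSketch[],
  shouldRepair?: (part: ToolPartSketch) => boolean
): { messages: MessageSketch[]; repaired: number } {
  let repaired = 0;
  const next = messages.map((message) => ({
    id: message.id,
    parts: message.parts.map((part): PartSketch => {
      if (part.type !== "tool" || hasSettledResult(part)) return part;
      // The optional predicate lets a host leave some parts verbatim.
      if (shouldRepair !== undefined && !shouldRepair(part)) return part;
      repaired += 1;
      return {
        type: "tool",
        toolCallId: part.toolCallId,
        state: "output-error",
        errorText: "Tool call was interrupted before it produced a result"
      };
    })
  }));
  return { messages: next, repaired };
}
```

Omitting shouldRepair repairs every interrupted part, which mirrors how @cloudflare/think is described as delegating.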

  • #1772 d4f27fe Thanks @mattzcarey! - Include each package's documentation in its published package.

  • #1790 190ea81 Thanks @threepointone! - Add stable approval descriptors for Think actions and preserve approval descriptor metadata on chat tool parts.

  • #1790 190ea81 Thanks @threepointone! - Harden action approval and authorization edge cases around approved inputs and continuation rechecks.

  • #1790 190ea81 Thanks @threepointone! - Add action permission metadata and default-full-grant authorization hooks for Think actions.

  • #1790 190ea81 Thanks @threepointone! - Add the action() descriptor and getActions() hook for compiling guarded
    server actions into Think tools.

  • #1790 190ea81 Thanks @threepointone! - Harden Think action descriptors with schema-inferred inputs and JSON-safe output
    normalization.

  • #1790 190ea81 Thanks @threepointone! - Route Think turn entry points through a shared internal _admitTurn spine and
    throw a clear error for nested blocking turn admissions that previously could
    deadlock.

  • #1797 f599892 Thanks @threepointone! - Fix: a recovered agent-tool child turn now re-binds its run row to the
    recovery turn's request id, so a healthy long-running child is no longer
    abandoned as interrupted after a deploy.

    When a facet running as an agent-tool child was interrupted mid-run (e.g. a
    deploy evicted it), its recovery continuation (continueLastTurn /
    _retryLastUserTurn) minted a fresh request id but left
    cf_agent_tool_child_runs.request_id pointing at the pre-eviction turn. Frame
    attribution (_agentToolRunForRequest) then failed, so the recovered turn's
    broadcast frames never reached the parent's re-attach tail; the parent saw no
    forward progress and sealed a still-advancing child interrupted once its
    no-progress budget elapsed. The recovery paths now re-bind the child-run row
    (and the in-memory attribution map) to the current turn's request id, keeping
    frames flowing across recovery so the parent re-attaches and follows the child
    to its real terminal.

  • #1797 f599892 Thanks @threepointone! - Fix: a recovered pre-stream retry turn now re-applies per-channel policy.

    continueLastTurn already re-resolved the channel from the persisted user
    message (metadata.channel) so a recovered partial turn re-applied its
    channel's instructions / tool narrowing. The pre-stream retry path
    (_retryLastUserTurn, used by _chatRecoveryRetry) admitted the recovered turn
    without re-resolving the channel, so an interrupted-before-streaming turn was
    retried with the default policy instead of the channel's — even though the
    metadata.channel stamp survived. It now re-resolves and re-applies the channel
    on both recovery paths, matching the documented invariant.

  • #1790 190ea81 Thanks @threepointone! - Add chat:turn:start and chat:turn:finish observability events for Think
    turn execution.

  • #1790 190ea81 Thanks @threepointone! - Think.waitUntilStable() now waits out an armed-but-unfired auto-continuation
    before reporting stable, converging onto @cloudflare/ai-chat.

    Previously, when a turn ended with no pending human/client interaction,
    waitUntilStable() reported stable immediately — even if an auto-continuation
    was armed (its ~50ms coalesce timer still pending, or its completeness drain in
    flight). In that window idle eviction or chat recovery could act on a transcript
    that was about to be continued. Think now mirrors @cloudflare/ai-chat: while
    a continuation is armed (pending && !pastCoalesce and the shared
    AutoContinuationController reports armed), waitUntilStable() reports
    not-stable and waits out the coalesce window, then re-checks (the continuation
    either fires and enqueues a turn the loop drains, or parks and clears, at which
    point the agent is genuinely stable).

  • Updated dependencies [7f367d8]:

    • create-think@0.1.1
