Span tree redesign. The plugin now emits spans that match the actual structure of an OpenClaw agent run instead of collapsing every generation + tool into a single fake `llm_request`. Existing dashboards keyed on `gen_ai.*` attribute names still work: span names changed, attribute namespaces didn't.
Changed (breaking, in trace shape)
- One `agent` span per run, with `model_call`/`tool_call`/`compaction`/`subagent` children. The old shape had a single `interaction` (renamed from `agent`) with one `llm_request` covering the whole attempt and tool spans as siblings. That was wrong on two counts: `llm_input`/`llm_output` fire ONCE per attempt (not per generation), and an attempt is a sequence of generations interleaved with tool executions. The new shape:

      agent (root, traceId = hash(runId))
      ├─ compaction      (0..1, rare; budget-triggered)
      ├─ model_call      (1..N, one per provider API call)
      ├─ tool_call: foo  (between model_calls; child of agent)
      ├─ model_call
      ├─ tool_call: bar
      ├─ subagent        (0..N; nested child agent spans land underneath)
      └─ model_call

  Tool spans are children of `agent` and siblings of `model_call`, not children of `model_call`, because tools run between generations, not during them.
- Span names follow OpenClaw's events, not OTel semantic-convention terms: `agent`/`model_call`/`tool_call`/`compaction`/`subagent`. Attribute namespaces stay `gen_ai.*` and `openclaw.*`.
- Per-call attributes on `model_call` spans. Each generation gets its own duration, outcome, error category, upstream request id hash, time-to-first-byte, request payload bytes, and response stream bytes, taken straight from the `model_call_started`/`model_call_ended` payloads.
- Per-call input messages on `model_call` spans via the snapshot trick. `gen_ai.input.messages` on each `model_call` reflects what THAT generation actually saw: the rolling history evolves across the run as `before_tool_call` appends synthetic assistant `tool_call` parts and `after_tool_call` appends `tool` responses. Per-call output messages and per-call usage aren't surfaced by OpenClaw today (they're attempt-aggregate); those stay on the `agent` span only, with a README pointer to the upstream feature request.
- Subagent spans nest the child's full agent tree underneath via cross-runId trace propagation. When `subagent_spawned` fires we register a `Map<childRunId, parentTraceId+subagentSpanId>` link. The child's `before_agent_start` consults the map, uses the parent's `traceId`, and parents the child agent under the parent's `subagent` span. Same trace, one waterfall across the spawn tree.
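
The cross-runId link described in the last item can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the `ParentLink` and `SpanContext` types, the function names, and the `deriveTraceId` callback (standing in for whatever `hash(runId)` the plugin uses) are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration only; the real plugin types may differ.
interface ParentLink {
  traceId: string;        // trace id of the spawning (parent) agent run
  subagentSpanId: string; // span id of the parent's `subagent` span
}

interface SpanContext {
  traceId: string;
  parentSpanId?: string;
}

// childRunId -> link back to the run that spawned it
const subagentLinks = new Map<string, ParentLink>();

// Would be called from a `subagent_spawned` handler.
function onSubagentSpawned(childRunId: string, link: ParentLink): void {
  subagentLinks.set(childRunId, link);
}

// Would be called from `before_agent_start` for every run.
// A run with a registered link joins the parent's trace; any other run
// becomes the root of its own trace.
function resolveAgentContext(
  runId: string,
  deriveTraceId: (runId: string) => string, // assumed stand-in for hash(runId)
): SpanContext {
  const link = subagentLinks.get(runId);
  if (link !== undefined) {
    return { traceId: link.traceId, parentSpanId: link.subagentSpanId };
  }
  return { traceId: deriveTraceId(runId) };
}
```

Because the child reuses the parent's `traceId` and hangs off the `subagent` span id, a trace viewer would render the whole spawn tree as one waterfall. When entries get evicted from the map is not covered above, so the sketch leaves them in place.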
Added
- New typed-hook subscriptions: `before_agent_start`, `model_call_started`, `model_call_ended`, `before_compaction`, `after_compaction`, `subagent_spawned`, `subagent_ended`.
- Privacy gating implemented via a `:gated` attribute key suffix. The OTLP encoder strips any attribute whose key ends in `:gated` when `allowConversationAccess === false`. This is a uniform mechanism with no per-key conditional. Gated attributes: `gen_ai.input.messages`, `gen_ai.output.messages`, `gen_ai.system_instructions`, `user_prompt`, `gen_ai.tool.call.arguments`, `gen_ai.tool.call.result`, `before_compaction.messages`, `before_agent_start.{prompt,messages}`, `agent_end.messages`, and `openclaw.error.message` (error strings can leak prompt/response content).
- Abandoned-span handling: any `model_call`/`tool_call`/`compaction`/`subagent` open at `agent_end` is force-closed with `outcome: "abandoned"` so trace gaps don't appear when an attempt errors mid-flight.
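
As a rough illustration of the `:gated` suffix mechanism from the list above, the filter below drops every attribute whose key ends in `:gated` when conversation access is disabled. The `Attributes` type and the `stripGated` name are assumptions for this sketch; the real OTLP encoder is not shown in these notes.

```typescript
// Hypothetical attribute bag; real OTLP attribute values are richer than this.
type AttributeValue = string | number | boolean;
type Attributes = Record<string, AttributeValue>;

const GATED_SUFFIX = ":gated";

// Returns a copy of `attrs` without gated keys when access is not allowed.
// One suffix check covers every gated attribute, so no per-key conditionals.
function stripGated(attrs: Attributes, allowConversationAccess: boolean): Attributes {
  const out: Attributes = {};
  for (const key of Object.keys(attrs)) {
    if (!allowConversationAccess && key.endsWith(GATED_SUFFIX)) {
      continue; // gated content never reaches the exporter
    }
    const value = attrs[key];
    if (value !== undefined) {
      out[key] = value;
    }
  }
  return out;
}
```

What happens to the suffix itself when access is allowed is not stated above, so the sketch leaves keys unchanged in that case.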
Removed
- `src/turn-builder.ts` + `src/turn-builder.test.ts`: replaced by `src/span-builder.ts`/`src/span-builder.test.ts`. The state machine is fundamentally different (multiple open spans per run instead of a single `RunRecord` with `LlmCallRecord[]`).
- Old `interaction` and `llm_request` span names: folded into `agent` and split per-generation as `model_call`.
- `orphanTools` logic: no longer needed once tools are paired with proper before/after events. Still-open tools at `agent_end` now go through the abandoned-span path.
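
The abandoned-span path that replaces `orphanTools` could look something like the sketch below. The `OpenSpan`/`ClosedSpan` shapes and the `closeAbandoned` helper are hypothetical and exist only to show the idea: at `agent_end`, every child span still open is closed with `outcome: "abandoned"`.

```typescript
// Hypothetical span records for illustration; not the plugin's real types.
type ChildSpanKind = "model_call" | "tool_call" | "compaction" | "subagent";

interface OpenSpan {
  kind: ChildSpanKind;
  name: string;
  startTimeMs: number;
}

interface ClosedSpan extends OpenSpan {
  endTimeMs: number;
  outcome: string;
}

// Would be called from an `agent_end` handler with the run's open child spans.
// Every span still open is closed at `nowMs` and marked as abandoned, then
// the open set is emptied so nothing lingers after the run.
function closeAbandoned(openSpans: Map<string, OpenSpan>, nowMs: number): ClosedSpan[] {
  const closed: ClosedSpan[] = [];
  openSpans.forEach((span) => {
    closed.push({ ...span, endTimeMs: nowMs, outcome: "abandoned" });
  });
  openSpans.clear();
  return closed;
}
```

Treating tools the same way as the other child spans is what makes a separate orphan-tracking structure unnecessary.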
Notes for operators
- Existing OpenClaw versions (≥ 2026.4.25) ship `model_call_started`/`model_call_ended` already; the minimum-version requirement is unchanged.
- Codex/Claude-Code style backends will still show one `model_call` per attempt because their internal generations aren't surfaced to OpenClaw's selection layer. README now documents this. Filing the OpenClaw-side enhancement (per-call usage + `assistantText` on `model_call_ended`) is upstream and out of scope here.
- The `before_tool_call` hook is a `runModifyingHook`. The plugin handler returns `undefined` (so OpenClaw dispatches the tool normally). A new regression test verifies that the return is `undefined`.
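
To make the last note concrete, here is a minimal sketch of an observe-only `before_tool_call` handler. The event and override shapes are invented for the example (OpenClaw's real hook types are not reproduced here). The only point taken from the note is that the handler records what it needs and returns `undefined`, so dispatch is unaffected.

```typescript
// Hypothetical event and result shapes; OpenClaw's actual types may differ.
interface BeforeToolCallEvent {
  runId: string;
  toolName: string;
}

interface ToolCallOverride {
  params?: Record<string, unknown>;
}

// Assumed contract: a modifying hook may return an override, and
// `undefined` means "no modification, dispatch normally".
type BeforeToolCallHandler = (event: BeforeToolCallEvent) => ToolCallOverride | undefined;

// Stand-in for opening a `tool_call` span; here it only records the call.
const startedToolCalls: string[] = [];

const onBeforeToolCall: BeforeToolCallHandler = (event) => {
  startedToolCalls.push(event.runId + ":" + event.toolName);
  return undefined; // observe only, never alter the tool call
};
```

A regression check in this spirit only needs to call the handler and assert that the result is strictly `undefined`.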