@oh-my-pi/pi-agent-core
Added
- Added `generateHandoffFromContext(context, model, options)` to `@oh-my-pi/pi-agent-core/compaction`: runs the handoff oneshot against a fully-built provider `Context` (system prompt, normalized tools, transformed history, trailing handoff prompt) with `streamOptions` mirroring the live turn's cache routing, so a host that owns the transform pipeline can make the handoff request share the prompt cache the main turn populated. `generateHandoff(messages, …)` is unchanged and now delegates to it.
- Added an optional `systemPrompt` argument to `Agent.buildSideRequestContext(llmMessages, systemPrompt?)`, defaulting to the live agent prompt; callers can pin a different prompt (e.g. handoff generation, which uses the base prompt rather than a per-turn `before_agent_start` hook override).
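The optional-argument behavior described above can be sketched as follows. This is a minimal illustration, not the package's implementation: `MiniAgent`, `LlmMessage`, and `SideRequestContext` are hypothetical, simplified stand-ins, and only the `buildSideRequestContext(llmMessages, systemPrompt?)` signature comes from these notes.

```typescript
// Hypothetical, simplified shapes; the real pi-agent-core types are richer.
interface LlmMessage {
  role: "user" | "assistant";
  content: string;
}

interface SideRequestContext {
  systemPrompt: string;
  messages: LlmMessage[];
}

class MiniAgent {
  constructor(private liveSystemPrompt: string) {}

  // Omitting systemPrompt falls back to the live agent prompt;
  // passing one pins the side request to that prompt instead.
  buildSideRequestContext(llmMessages: LlmMessage[], systemPrompt?: string): SideRequestContext {
    return {
      systemPrompt: systemPrompt ?? this.liveSystemPrompt,
      messages: [...llmMessages],
    };
  }
}

const agent = new MiniAgent("live prompt (with per-turn hook override)");
const history: LlmMessage[] = [{ role: "user", content: "Summarize the session." }];

const sideDefault = agent.buildSideRequestContext(history);
const sidePinned = agent.buildSideRequestContext(history, "base prompt");
```

Here `sideDefault` carries the live prompt while `sidePinned` carries the pinned one, which is the distinction handoff generation relies on.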
Changed
- Updated `buildSideRequestContext` to allow pinning custom system prompts
@oh-my-pi/pi-ai
Fixed
- Fixed Anthropic-compatible thinking requests sending replayed thinking blocks without `context_management.keep: "all"`, preserving multi-turn reasoning context for API-key providers. API-key requests now also advertise the required `context-management-2025-06-27` beta header so the field is honored instead of rejected. Injected SDK clients, GitHub Copilot's Anthropic proxy, and Vertex rawPredict are excluded because this code path cannot add the beta to caller-owned clients, Copilot strips Anthropic betas and demotes thinking blocks to text upstream, and Vertex expects betas in the JSON body rather than the Anthropic HTTP beta header. (#3288)
- Fixed OpenRouter Responses native history replay leaking Gemini reasoning item `format` metadata back into follow-up requests, which caused HTTP 400 rejections. Encrypted reasoning replay is preserved.
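The eligibility rule in the Anthropic fix above can be sketched as a small predicate plus a header merge. This is an assumption-laden illustration: the `Transport` labels and all function names are hypothetical, and only the beta name and the list of excluded paths come from these notes. The merge assumes the `anthropic-beta` header carries a comma-separated list.

```typescript
// Hypothetical transport labels; the real provider plumbing is not shown in these notes.
type Transport = "api-key" | "injected-client" | "copilot-proxy" | "vertex-raw-predict";

const CONTEXT_MANAGEMENT_BETA = "context-management-2025-06-27";

// Only API-key requests with thinking enabled qualify, per the exclusions listed above.
function shouldKeepThinkingContext(transport: Transport, thinkingEnabled: boolean): boolean {
  return thinkingEnabled && transport === "api-key";
}

// Append a beta to a comma-separated header value without duplicating it.
function withBeta(existing: string | undefined, beta: string): string {
  const betas = (existing ?? "")
    .split(",")
    .map((b) => b.trim())
    .filter((b) => b.length > 0);
  if (betas.indexOf(beta) === -1) betas.push(beta);
  return betas.join(",");
}

// Returns the header value to send, leaving excluded transports untouched.
function buildBetaHeader(
  transport: Transport,
  thinkingEnabled: boolean,
  existing?: string,
): string | undefined {
  if (!shouldKeepThinkingContext(transport, thinkingEnabled)) return existing;
  return withBeta(existing, CONTEXT_MANAGEMENT_BETA);
}
```

Keeping the predicate separate from the header merge means the same check can gate both the request-body field and the header, so the two cannot drift apart.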
@oh-my-pi/pi-coding-agent
Breaking Changes
- Renamed the eval `agent()` helper parameters `agent_type` → `agent` and `return_handle` → `handle` across every workflow runtime (Python, JavaScript, Ruby, Julia), so the names are identical in every language (no camelCase/snake_case split) and the agent-selection parameter matches the `task` tool's `agent`. The `__agent__` eval bridge wire protocol was renamed to match.
- Changed the `eval` tool to take a single cell per call (`{ language, code, title?, timeout?, reset? }`) instead of a `cells` array. State still persists per language across separate eval calls, tool calls, and `task` subagents, so each call is one logical step that reuses everything earlier calls defined; the array only encouraged re-importing/re-declaring the same setup in every batch. The schema, field descriptions, examples, system `eval.md`/`workflowz` helper docs, and the `[i/n]` cell-counter (now hidden for single cells) were updated to match; the renderer, ACP start-text, copy-targets, and collab-web tool view still parse legacy multi-cell transcripts.
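A consumer that has to read both the new single-cell args and legacy multi-cell transcripts could normalize them along these lines. This is a sketch under assumptions: the field names come from the notes above, but the field types, the per-cell shape of the legacy `cells` array, and all function names are hypothetical.

```typescript
// New single-cell args, using the field names listed in the notes above.
// Field types are assumed.
interface EvalCellArgs {
  language: string;
  code: string;
  title?: string;
  timeout?: number;
  reset?: boolean;
}

// Assumed legacy shape: an array of cells with the same per-cell fields.
interface LegacyEvalArgs {
  cells: EvalCellArgs[];
}

function isLegacy(args: EvalCellArgs | LegacyEvalArgs): args is LegacyEvalArgs {
  return Array.isArray((args as LegacyEvalArgs).cells);
}

// Normalize either shape to a list so one renderer handles old and new transcripts.
function normalizeEvalArgs(args: EvalCellArgs | LegacyEvalArgs): EvalCellArgs[] {
  return isLegacy(args) ? args.cells : [args];
}

// The [i/n] counter is only shown when there is more than one cell.
function cellLabel(index: number, total: number): string {
  return total > 1 ? `[${index + 1}/${total}]` : "";
}
```

Normalizing to an array at the boundary keeps the rest of the rendering code unaware of which transcript generation it is reading.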
Added
- Added `isolated`, `apply`, and `merge` options to eval `agent()` across every workflow runtime (Python, JavaScript, Ruby, Julia) so `workflowz`-driven fan-outs can request the same copy-on-write worktree isolation the `task` tool offers (strict opt-in via `isolated: true`, matching the `task` tool; `apply: false` keeps captured patches/branches without merging back; `merge: false` forces patch mode). Extracted the task-isolation lifecycle into `task/isolation-runner.ts` so the eval bridge and `TaskTool` share one implementation. (#3196)
Changed
- Made the session picker fullscreen with mouse support for clicking rows and scrolling
- Pinned the session picker footer to the bottom of the screen to prevent layout flickering
- Simplified `eval` tool to accept a single logical step (code block) instead of an array of cells
- Updated `eval` tool documentation to emphasize incremental, single-step execution
- Restricted `bash` tool from using `ls` or `find`, requiring the use of `read` or `find` tools
- Simplified `todo` tool interface to accept a single operation directly instead of an array of ops
- Reinforced routing of fragile, multi-step shell logic to the `eval` tool over `bash`. The system-prompt tool policy, `bash.md`, and `eval.md` now treat loops, conditionals, heredocs, inline `-e`/`-c` scripts, multi-stage pipelines, and quote/JSON escaping as the signal to write an `eval` cell; bash's "compute a fact" carveout is narrowed to single short pipelines, and `eval.md` now actively claims that territory with runtime-templated examples (only enabled backends are advertised).
- Made `eval` an essential built-in tool (`loadMode: "essential"`, added to the default essential tool set) so it stays active under `tools.discoveryMode: "all"` instead of being hidden behind `search_tool_bm25`.
- Made the `--resume` session picker fullscreen on the terminal's alternate screen, so the list scrolls with the mouse wheel and a row resumes its session on left click. Rows are hit-tested against the live scroll window, and the keybinding hint + bottom border are now pinned to the screen bottom instead of drifting up and down as the visible window changes height.
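Hit-testing a click against a scrolling list window, as the session picker entry above describes, generally comes down to translating a screen row into a list index. The sketch below illustrates that general technique only; the geometry parameters and function names are hypothetical and do not reflect the picker's actual code.

```typescript
// Hypothetical geometry: listTop is the first screen row used by the list (0-based),
// scrollTop is the index of the first visible session, visibleRows is the window height.
function rowAtClick(
  clickRow: number,
  listTop: number,
  scrollTop: number,
  visibleRows: number,
  totalRows: number,
): number | undefined {
  const offset = clickRow - listTop;
  if (offset < 0 || offset >= visibleRows) return undefined; // outside the list window
  const index = scrollTop + offset;
  return index < totalRows ? index : undefined; // blank area below the last row
}

// Wheel scrolling moves the window and clamps it to the valid range.
function scrollBy(
  scrollTop: number,
  delta: number,
  visibleRows: number,
  totalRows: number,
): number {
  const maxTop = Math.max(0, totalRows - visibleRows);
  return Math.min(maxTop, Math.max(0, scrollTop + delta));
}
```

Because the click is resolved against the current `scrollTop`, the same screen row maps to different sessions as the window scrolls, which is why the hit test has to use the live window rather than a cached layout.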
Fixed
- Fixed `local://` URLs decoding images as corrupted text (mojibake) instead of showing the image
- Fixed `omp --resume` hanging instead of exiting when the startup session picker is cancelled (Esc) or there are no sessions to resume. Startup arms long-lived handles (theme/appearance listeners, settings save timer, model registry), so the cancel/empty paths' bare `return` left the event loop alive and the process stuck after the picker cleared the alternate screen. These paths now exit cleanly via `process.exit(0)`, matching the `--version`/`--export` early-exit convention. The in-session `/resume` picker is unaffected: it keeps its own cancel handler that just closes the overlay.
- Fixed the `/resume` session picker scrolling down after a session is deleted. The delete-confirmation dialog was mounted as a sibling below the picker's bottom border, briefly growing the picker past the terminal height; the TUI committed the picker's header rows into native scrollback to fit, and when the dialog closed `windowTop` stayed pinned at the new commit boundary, leaving the header stranded above the viewport. The picker now hosts the `SessionList` in a single content slot and swaps the dialog INTO that slot (replacing the `SessionList`) while it is open, so the dialog only competes with the `SessionList`'s rendered budget, not the `SessionList` AND the picker chrome, and the picker frame stays inside the viewport. (#3283)
- Fixed the `eval` tool card not streaming a still-running cell's stdout: a long-running cell (e.g. a `time.sleep()` monitor loop) showed nothing until it returned or was interrupted, then dumped everything at once. The renderer draws cell output from `details.cells[i].output`, which was only populated after `backend.execute()` resolved; live stdout streamed into the transient result `content` tail (and `renderContext.output`), which the per-cell render branch ignores. Streamed chunks now append to the active cell's `output` (a dedicated per-cell tail buffer, capped like the aggregate) as they arrive, so the card shows progress live; on completion the authoritative full output overwrites the live tail. `log()`/`phase()`/`display()` and status ops were unaffected because they already stream via the status channel.
- Fixed Escape doing nothing in the Settings text-input fields (e.g. "Python Interpreter") on terminals with the kitty keyboard protocol active (ghostty/kitty). Inside the fullscreen settings overlay the protocol reports Escape as the CSI-u sequence `\x1b[27u`, which the text-input submenu's raw `\x1b` compare missed; `handleInputOrEscape` now decodes Escape via `matchesKey`, matching every other Escape-to-cancel path.
- Fixed Julia `eval` graph/plot visualization (Plots.jl, GraphRecipes, Makie, etc.) never rendering inline. Two bugs: (1) the runner's `build_mime_bundle`/`emit_error` dispatched `show`/`showable`/`showerror` directly from the long-lived `main()` loop, whose world age is frozen before any cell ran, so rich `show(::IO, ::MIME"image/png", …)` methods registered when a plotting package is `using`-ed inside a cell were invisible; `show` fell back to the default struct repr (which itself threw on Julia 1.12, aborting the whole result). These calls now route through `Base.invokelatest`, and the `text/plain` probe is guarded so a failing repr can no longer suppress the image MIME. (2) The default GR backend popped up a native `gksqt` GUI window on each plot; the runner now defaults `GKSwstype=100` (headless, overridable) so plots render only as inline PNGs, mirroring the Python runner's `MPLBACKEND=Agg` default.
- Fixed streaming output blocks incorrectly calculating preview height, preventing flickering banners
- Fixed streaming `bash`/`eval` tool output duplicating its `… (N earlier lines, showing 10 of M) (ctrl+o to expand)` preview into native scrollback. The collapsed output is a sliding tail window fixed at 10 lines, so when the box outgrew the live viewport (a tall command/output under a still-live predecessor such as a parallel tool) its mutating tail scrolled above the commit window and the renderer re-committed a fresh snapshot every frame, stacking dozens of stale preview banners and chunks. The output preview is now clamped to the viewport tail (`Math.min(10, previewWindowRows())`) and measured in visual rows at the box's inner content width (via the new `outputBlockContentWidth` helper), so on short terminals the volatile tail shrinks to stay on-screen and is never committed. Fixes the duplication introduced when scroll-off commits were made loss-free.
- Prevented `/handoff` from executing while a response is streaming to avoid session corruption
- Fixed `/handoff` cold-missing the provider prompt cache. Handoff generation now builds its request through the same pipeline a live turn uses (`convertMessagesToLlm` + `Agent.buildSideRequestContext` + `prepareSimpleStreamOptions`, via the new `generateHandoffFromContext`), so it reuses the live system prompt, normalized tools, transformed/obfuscated message history, and, critically, a stable `promptCacheKey` with a unique side `sessionId`. Previously the oneshot sent no cache-routing key and skipped the `transformContext`/`transformProviderContext` and tool/message normalization the loop applies, so its prefix never matched what the turn populated and every handoff re-read the whole context uncached. Mirrors the cache-preserving path already used by `/btw` and `/omfg`.
- Fixed `/handoff` (and the RPC `handoff` command) resetting the agent while a response was still streaming, which let the live turn keep emitting into the torn-down session. Manual handoff now refuses while a prompt is in flight (matching `/fork` and `/move`); the auto-handoff path is unaffected.
- Fixed Exa web search requests firing back-to-back with no client-side pacing by adding a configurable `exa.searchDelayMs` delay (default 1000ms) between Exa search requests. (#3271)
- Fixed `ask` returning `(cancelled)` or aborting the tool when Escape dismissed `Other (type your own)` custom input; it now returns to the option selector so the user can pick a listed answer instead. (#3269)
- Fixed `/goal` threshold auto-compaction skipping real sessions through three paths: per-turn supersede/drop-useless pruning no longer deflates the threshold trigger below the last provider-billed context; active-goal text stops now attempt threshold maintenance before unexpected-stop retry continuations can return from post-turn handling; and empty `toolUse` stops keep the existing cleanup pass that strips the orphan assistant from active context + session history before any compaction continuation. Active-goal compaction continuations now also resolve completed retry gates before returning, preventing `isRetrying` from staying stuck after a retry succeeds over the threshold. Added `agent_end maintenance routing` and `Auto-compaction threshold decision` debug logs so future no-start reports identify the exact early-return branch and the billed/stored/resolved/post-maintenance token counts that fed `shouldCompact`. (#3174)
- Fixed active `/goal` runs that never reached `agent_end` because the model kept emitting tool calls inside one agent run. Threshold maintenance now runs between tool-call turns, compacts the live loop context in place, and suppresses queued continuations that would race the still-running goal loop. (#3174)
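The Escape fix above hinges on one key having more than one wire encoding. The sketch below shows why a raw `\x1b` comparison misses the kitty-protocol form. `isEscape` is a hypothetical stand-in written for illustration and is not the project's `matchesKey`. The modifier handling assumes the kitty protocol's convention that a modifier field of `1` means no modifiers.

```typescript
// The two encodings mentioned above: legacy raw ESC and the kitty CSI-u form.
const LEGACY_ESCAPE = "\x1b";
const KITTY_ESCAPE = "\x1b[27u";

// Hypothetical stand-in for a key matcher; not the project's matchesKey.
function isEscape(data: string): boolean {
  if (data === LEGACY_ESCAPE || data === KITTY_ESCAPE) return true;
  // CSI-u may also carry an explicit modifier field; "1" means no modifiers held.
  const match = /^\x1b\[27;(\d+)u$/.exec(data);
  return match !== null && match[1] === "1";
}

// A raw comparison only recognizes the legacy byte and misses the CSI-u form.
const rawCompareMatchesKitty = KITTY_ESCAPE === LEGACY_ESCAPE;
```

`rawCompareMatchesKitty` is `false`, which is the failure mode the entry describes: the input arrives, but the cancel branch never fires.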
Removed
- Removed `append`, `tree`, and `diff` eval helper functions from Python, JavaScript, and Ruby
- Removed `sort`, `uniq`, and `counter` text processing eval helpers from Python, JavaScript, and Ruby
- Removed the `append(path, content)`, `tree(path, max_depth?, show_hidden?)`, and `diff(a, b)` eval prelude helpers from every workflow runtime (Python, JavaScript, Ruby, Julia), along with their status renderers, icon entries, and tool/`docs` references. Use `write`/`read` for file mutation and `tool.<name>(...)` for richer filesystem operations.
- Removed the `sort(text, reverse?, unique?)`, `uniq(text, count?)`, and `counter(items, limit?, reverse?)` eval text helpers from the Python, JavaScript, and Ruby prelude surfaces (Julia never defined them), along with the JS `HelperBundle`/`HelperOptions` members and `docs` references. Sort/dedupe/count inline in cell code instead.
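For cells that relied on the removed text helpers, the inline equivalents are short. The sketch below uses only standard language features. Whether it matches the removed helpers' exact semantics (for example, whether `uniq` deduplicated globally or only adjacent lines) is not confirmed by these notes, so treat it as one reasonable replacement.

```typescript
const text = "pear\napple\npear\nfig\napple\npear";
const lines = text.split("\n");

// Sorted lines (copy first so the original order is kept).
const sorted = lines.slice().sort();

// Distinct lines in first-seen order; a Set preserves insertion order.
const unique = Array.from(new Set(lines));

// Frequency table, most common first, limited to the top two entries.
const counts = new Map<string, number>();
for (const line of lines) {
  counts.set(line, (counts.get(line) ?? 0) + 1);
}
const top = Array.from(counts.entries())
  .sort((a, b) => b[1] - a[1])
  .slice(0, 2);
```

In a plain JavaScript cell, the same code works once the `<string, number>` type argument is dropped.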
@oh-my-pi/collab-web
Added
- Added support for Ruby and Julia code cells in the eval tool
Changed
- Updated the eval tool view to render the new single-cell eval args (flat `language`/`code`/`title`/`timeout`/`reset`) and to highlight Ruby (`rb`) and Julia (`jl`) cells with their own syntax instead of collapsing them to Python, while still parsing legacy multi-cell `cells` arrays and framed `input` strings from older transcripts.
Fixed
- Improved compatibility with legacy todo task transcripts
What's Changed
- test(eval): raised Julia prelude per-test timeout to 30s by @roboomp in #3275
- fix(compaction): trigger /goal threshold on billed context, not post-prune estimate by @roboomp in #3175
- fix(coding-agent): return to ask options after custom escape by @roboomp in #3270
- fix(web-search): pace exa search requests by @roboomp in #3272
- fix(session-selector): keep /resume header pinned after delete by @roboomp in #3285
- fix(ai): keep anthropic thinking context by @roboomp in #3290
Full Changelog: v16.1.15...v16.1.16