can1357/oh-my-pi v15.6.0 on GitHub

@oh-my-pi/pi-ai

Fixed

Fixed Anthropic adaptive-thinking replay preserving signed thinking blocks on the latest abandoned tool-use assistant message, avoiding thinking blocks in the latest assistant message cannot be modified 400s. (#1531)

@oh-my-pi/pi-coding-agent

Added

Added prompt-mode autocomplete for supported internal URL schemes (skill://, rule://, agent://, artifact://, local://, memory://, and omp://) so typing those tokens now suggests existing resources as completion candidates
Added fuzzy matching and ranked suggestion ordering for internal URL completion, including rule and skill descriptions, with accepted completion replacing just the typed token and inserting the chosen URL followed by a space
Changed internal URL completions now include nested local:// path suggestions from the configured local workspace
Added Mnemosyne memory inference model selection with an online mode or local transformers.js options (qwen3-1.7b, gemma-3-1b, qwen2.5-1.5b, lfm2-1.2b) so memory extraction and consolidation can run via the shared tiny-model worker
Changed memory tiny-model handling to route local memory prompts through the same queueed tiny-model worker pipeline with bounded completion output
Added a Providers → Tiny Model setting for session titles, defaulting to the online pi/smol path with five optional local CPU transformers.js models. A local model — and the one-time @huggingface/transformers runtime install in compiled binaries — is downloaded and loaded only when explicitly selected (or via omp tiny-models download); the default online path never spawns the title worker for inference. Selecting a local model adds a delayed pi/smol fallback so titles never block, plus in-chat download progress.
Added a persistent live agent roster pinned below the editor (focus it with Ctrl+S or Alt+Down), including view-as switching into delegated agent sessions with human-readable delegate names and UI pinning to suppress idle reaping while viewed. The roster stays hidden until at least one delegated agent exists and releases focus back to the editor once the last one is gone.
Recorded the originating session ID alongside each prompt in history.db (new session_id column, surfaced as HistoryEntry.sessionId), so recalled prompts can be traced back to the session they came from. Existing history databases gain the column automatically on next launch.
Added compact inline TUI renderers for the retain, recall, and reflect memory tools. retain now shows one themed bullet line per stored item (truncated to width) under a status header with the stored/queued count, and recall/reflect collapse to a single query header (recall reports the match count and hides recalled memories until expanded) instead of dumping the raw JSON argument tree.
Added a randomly picked tip beneath the welcome screen, sourced from an embedded tips.txt (one tip per line). The line is italicized with a purple Tip: label and a dimmed light-blue body, and the tip is chosen once per welcome instance so intro-animation and LSP re-renders don't shuffle it.
Added a Mnemosyne-only memory_edit agent tool for updating, forgetting, or invalidating recalled memories by id, and added /memory stats plus /memory diagnose slash commands for backend maintenance visibility.
Added an orchestrate magic keyword that mirrors ultrathink: dropping the standalone word in a message paints it with a cool teal→violet gradient in the editor and appends a hidden system notice that switches the model into the multi-phase, parallel-subagent orchestration contract. Matching is word-bounded and case-insensitive, so orchestrated/orchestrating never trigger it.
Added a model-tier slider to the plan-approval prompt ("Plan mode - next step"). Left/right arrows move it from any list position to pick which configured role model (cycleOrder, e.g. smol › default › slow) executes the approved plan, with each tier colored by its role and the resolved model name shown beneath the track. The chosen tier is applied before dispatch and carries through the fresh/compacted execution session; the slider is hidden when fewer than two role models resolve.

Changed

Changed irc to treat the attached human as a first-class User peer, merging human prompts into irc call User with optional structured question payloads and adding /dm <agent> <message> for user-to-agent routing without switching views.
Changed the --resume session picker (and the in-session resume selector) to also rank sessions by prompt-history matches from history.db, not just the session-list metadata. Because the session list only indexes the first 4KB of each file, this surfaces sessions by prompts typed deep into long conversations. Sessions matched by both signals lead, then metadata-only matches, then history-only matches — no metadata match is dropped.
Changed the task tool's streaming call preview to list each dispatched agent's id and UI description as a tree instead of a bare N agents count, so the individual agents are visible while the tool-call arguments are still streaming. The collapsed view caps at 12 entries (… N more agents); the expanded view shows all.
Changed Mnemosyne recall tool output to include memory ids for explicit recall results so agents can target memory_edit; auto-injected memory context and reflect remain id-free.
Changed the system prompt to advertise memory://root only when the local memory backend is active.
Changed todo_write result rendering to animate completed items in place: the checkbox flips checked first, then the strikethrough reveals across the task text.

Removed

Removed the standalone ask, task, and yield tools along with their obsolete prompts, docs, and tests; delegation now routes through persistent delegate agents plus IRC coordination.
Removed the /orchestrate slash command; orchestration is now triggered by the orchestrate keyword (see Added) so the contract rides alongside the user's own prompt instead of replacing it.
Removed the sticky Todos panel all-done drop/collapse animation; completed todo state now stays visible until the next explicit todo update changes it.

Fixed

Fixed Mnemosyne session shutdown to flush queued memory extractions before exit so the last turn’s facts are not lost
Fixed a native crash (malloc: pointer being freed was not allocated / NAPI FATAL ERROR) when quitting after the local transformers.js title model had run. The tiny-title worker no longer calls pipeline.dispose() on shutdown — disposing the onnxruntime session freed native memory that Bun's worker/NAPI teardown then freed again. The worker is torn down immediately after, so the OS reclaims the model memory regardless.
Fixed the tiny-title download progress bar flashing on every first message even when the local model was already downloaded. A cached model emits the same download/progress events as a real download, so the bar is now revealed only when in-flight progress events keep arriving past a short grace window — cache hits finish (or fall silent during onnxruntime init) before then and never show the bar.
Fixed the Mnemosyne memory backend lifecycle so auto-retain counts the full session transcript, delegated agents inherit the parent Mnemosyne state, /memory clear removes scoped project-bank databases, session disposal closes Mnemosyne SQLite handles, session switches rekey/reset Mnemosyne tracking, and project bank names include an absolute-root hash with safe bank-name sanitization.
Fixed the streaming edit preview showing no diff for single-line hashline edits. The preview-diff coalescing keyed only on the arg text, so the final (args-complete) pass — which computes an untrimmed diff — was skipped because the payload was byte-identical to the last streamed chunk whose trailing line had been trimmed. The dedup key now pairs the streaming state with a content hash.
Fixed Esc in a delegated agent view returning to the main session instead of aborting the delegated agent's active turn.
Fixed the subagent stats line to separate the cost with the theme dot separator (was a stray literal .) and to render context usage as <pct>%/<window> (e.g. 21.3%/272K) matching the status line gauge, via a shared formatContextUsage helper now used by the footer, status-line segment, session observer overlay, and task renderer.
Fixed the agent roster staying pinned under the editor when all delegated agents are idle or dormant; it now reappears when explicitly focused with Alt+Down / session observe.
Fixed selector-style UI components to honor tui.select.up and tui.select.down keybindings instead of hard-coding raw Up/Down arrow bytes (#1535).
Fixed the bash (and recipe) tool result footer not rendering for failed commands. A non-zero exit threw a ToolError, which dropped the result details, so the styled ⟨Wall … | Timeout …⟩ footer was replaced by the raw Wall time: … seconds / Command exited with code N lines. Non-zero exits now resolve as a non-throwing error result that keeps wallTimeMs/timeoutSeconds/exitCode, and the footer shows ⟨Wall … | Timeout … | Exit: N⟩ with the textual notices folded out of the output pane. Aborts, timeouts, and missing-exit-status still throw as before.
Fixed selector-style UI components to honor tui.select.up and tui.select.down keybindings instead of hard-coding raw Up/Down arrow bytes (#1535).

@oh-my-pi/pi-mnemosyne

Added

Added llm.extractionPrompt runtime option to override the fact-extraction prompt template using {text} and {lang} placeholders
Added llm.consolidationPrompt runtime option to override the consolidation sleep prompt template using {memories}, {source}, and {memory_count} placeholders
Published @oh-my-pi/pi-mnemosyne to npm: the local SQLite memory engine is now built, checked, tested, and released through the monorepo CI pipeline alongside the other workspace packages.
Exported the diagnostic inspector as the @oh-my-pi/pi-mnemosyne/diagnose subpath for coding-agent memory maintenance commands.
Added flushExtractions() (on Mnemosyne, BeamMemory, and as a module-level export) to drain in-flight background fact extraction; used by tests and graceful shutdown so facts are persisted before the database closes.

Changed

Changed fact extraction to prefer a configured runtime LLM completion path before host extraction, with automatic fallback when the configured completion returns no output or fails

Fixed

Fixed rememberBatch(..., { extract: true }) to run background fact extraction for batch uploads (including per-item extract flags) so extracted facts are generated and recallable after extraction
Fixed extract: true fact extraction to continue safely when no LLM is configured by turning extraction failures into no-op background tasks
Fixed configured LLM fact extraction by using temperature 0 so re-ingesting the same text is deterministic and avoids near-duplicate extractions
Fixed remember(..., { extract: true }) silently dropping the flag: it now schedules the LLM fact extractor (extractFactsSafe) over the stored content and persists the extracted facts so they become recallable. Previously the LLM extractor had no production callers and extract was dead.

@oh-my-pi/pi-natives

Changed

Changed npm publishing to ship @oh-my-pi/pi-natives as a small core loader package plus per-platform optional dependency leaf packages, so installs fetch only the host platform's native addon instead of every supported .node binary.

@oh-my-pi/pi-tui

Added

Added autocomplete triggering for internal URL scheme tokens such as local:// and skill:// while typing in the editor

Fixed

Fixed streaming output staying invisible in Windows Terminal + WSL2 until the window was minimized + restored. The 15.5.14 WSL branch of requiresNativeViewportProofForReplay treated an unknown native viewport state as "scrolled into history" — but ProcessTerminal.isNativeViewportAtBottom can only return a real answer through kernel32.dll FFI, which a Linux user-space process inside WSL cannot load, so the probe was permanently undefined. Every row-inserting structural mutation (each new streaming token row above the bottom-anchored prompt) was therefore classified as deferredMutation and emitted zero bytes. Any geometry change (resize/minimize/restore) bypassed the gate via a different render intent, which is why the output became visible only on window resize. The WSL clause is removed; on platforms where the probe cannot answer, unknown is treated as at-bottom (the pre-15.5.14 behaviour) so the live render path runs again. Native Win32 keeps the conservative "assume scrolled when unknown" heuristic since kernel32 FFI does succeed there and unknown means the probe transiently failed. (#1534)

@oh-my-pi/pi-utils

Added

Added an XDG-aware tiny-title model cache directory helper for coding-agent local title models.

What's Changed

fix(ai): preserve latest Anthropic thinking signatures by @roboomp in #1532
fix(tui): rendered WSL streaming output without resize trigger by @roboomp in #1537
fix(tui): respect selector navigation keybindings by @roboomp in #1538
fix(ai): use Copilot model context window limits by @roboomp in #1540

Full Changelog: v15.5.15...v15.6.0