can1357/oh-my-pi v16.1.3 on GitHub

@oh-my-pi/pi-ai

Added

Added regression test pinning that openai-completions emits a thinking block for reasoning_content deltas even when delta.content is explicitly JSON null (the DeepSeek-format dual-key pattern used by custom GLM/Qwen reasoning providers). See #2996.
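
A minimal sketch of the contract that test pins, assuming a simplified delta shape. `ChatDelta`, `StreamBlock`, and `blocksFromDelta` are hypothetical names for illustration, not the package's actual API.

```typescript
// Hypothetical shapes, loosely modelled on OpenAI-compatible streaming deltas.
interface ChatDelta {
  content?: string | null;
  reasoning_content?: string | null;
}

type StreamBlock =
  | { type: "thinking"; text: string }
  | { type: "text"; text: string };

// Convert one delta into zero or more blocks. A null or empty `content`
// must not suppress the `reasoning_content` carried on the same delta.
function blocksFromDelta(delta: ChatDelta): StreamBlock[] {
  const blocks: StreamBlock[] = [];
  if (typeof delta.reasoning_content === "string" && delta.reasoning_content.length > 0) {
    blocks.push({ type: "thinking", text: delta.reasoning_content });
  }
  if (typeof delta.content === "string" && delta.content.length > 0) {
    blocks.push({ type: "text", text: delta.content });
  }
  return blocks;
}

// Dual-key delta: `content` is explicitly null while reasoning streams.
const dualKeyBlocks = blocksFromDelta({ content: null, reasoning_content: "Let me check." });
```

Here `dualKeyBlocks` holds a single thinking block, so the null `content` key is ignored instead of short-circuiting the delta.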

Changed

Improved the thinking loop guard to treat assistant text loops as retryable errors
Refined text normalization logic to reduce false positives in the thinking loop detector

Fixed

Fixed Ollama chat requests sending image payloads to text-only models. Image blocks are now omitted and replaced with the standard non-vision placeholder for models without vision support, while vision-capable Ollama models continue to receive images. (#3009 by @serverinspector)
Fixed SqliteAuthCredentialStore.close() leaking one-off prepared statements created by inline this.#db.prepare() calls in #authCredentialsTableExists, #readAuthSchemaVersion, #inferAuthSchemaVersion, #migrateAuthSchemaV0ToV1, #backfillCredentialIdentityKeys, and updateAuthCredential. Each statement is now wrapped in try/finally with stmt.finalize(), and close() now also finalizes #insertUsageCostStmt and #listUsageCostsStmt, which were previously missed. The leaked statements caused EBUSY on Windows when tests tried to delete temp dirs that still contained open SQLite handles.
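
The try/finally pattern described for one-off statements could look roughly like this. The `Statement` and `Database` interfaces are pared-down stand-ins and `withStatement` is a hypothetical helper; only the `prepare()` and `finalize()` calls come from the note above.

```typescript
// Minimal structural types for illustration; a real SQLite binding exposes more.
interface Statement<T> {
  get(...params: unknown[]): T | undefined;
  finalize(): void;
}

interface Database {
  prepare<T>(sql: string): Statement<T>;
}

// Run `use` against a one-off prepared statement and always finalize it,
// even when `use` throws, so no statement handle outlives the call.
function withStatement<T, R>(
  db: Database,
  sql: string,
  use: (stmt: Statement<T>) => R,
): R {
  const stmt = db.prepare<T>(sql);
  try {
    return use(stmt);
  } finally {
    stmt.finalize();
  }
}
```

Centralizing the finalize call in one helper keeps each call site from having to repeat the try/finally block.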

@oh-my-pi/pi-catalog

Fixed

Marked Ollama Cloud catalog models to omit on-the-wire output-token caps, preventing context-window-sized num_predict values from causing HTTP 400s for models whose true output cap is not discoverable. (#2984)
Fixed readModelCache/writeModelCache using a process-global shared database even when a custom dbPath was provided. Custom-path cache operations now open and close a per-call database via withModelCacheDb, preventing leaked SQLite handles on Windows
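
As a rough illustration of omitting the on-the-wire output cap, the sketch below builds Ollama-style request options. The `omitMaxOutputTokens` policy name appears elsewhere in these notes; the `ModelPolicy` shape and `buildOllamaOptions` are assumptions made for this example.

```typescript
// Hypothetical catalog entry shape; only `omitMaxOutputTokens` is named in these notes.
interface ModelPolicy {
  contextWindow: number;
  maxOutputTokens?: number;
  omitMaxOutputTokens?: boolean;
}

interface OllamaOptions {
  num_ctx: number;
  num_predict?: number;
}

// Build request options, leaving `num_predict` off the wire entirely when
// the model is marked as having no discoverable output cap.
function buildOllamaOptions(model: ModelPolicy): OllamaOptions {
  const options: OllamaOptions = { num_ctx: model.contextWindow };
  if (!model.omitMaxOutputTokens && model.maxOutputTokens !== undefined) {
    options.num_predict = model.maxOutputTokens;
  }
  return options;
}

// A cloud model whose "cap" is really just its context window size.
const cloudOptions = buildOllamaOptions({
  contextWindow: 131072,
  maxOutputTokens: 131072,
  omitMaxOutputTokens: true,
});
```

With the flag set, `cloudOptions` carries no `num_predict` key at all, so the server applies its own default instead of rejecting a context-window-sized value.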

@oh-my-pi/pi-coding-agent

Changed

Refactored Perplexity authentication logic to prioritize cookies over OAuth in search operations
Updated token command to correctly display active Perplexity OAuth tokens when present

Fixed

Enabled auto-retry for AI "thinking loop" errors encountered during model inference
Cleared stale error banners automatically when an auto-retry recovery phase begins
Preserved bundled omitMaxOutputTokens policy when fresh cached provider discovery rows replace Ollama Cloud catalog models, so stale models.db entries cannot re-enable context-window-sized num_predict values. (#2984)
Normalized cached-only Ollama Cloud discovery rows to omit on-the-wire output-token caps even when the cached model id has no bundled catalog entry. (#2984)
Fixed Ollama, LM Studio, and llama.cpp (plus loopback vLLM / sglang servers) reprocessing the full prompt on every turn because provider.appendOnlyContext: auto only recognized DeepSeek and Xiaomi as prefix-cache providers. The auto-detect now enables append-only mode for ollama, ollama-cloud, lm-studio, llama.cpp, and any baseUrl resolving to a loopback/RFC1918/.local host, so the system prompt + tool catalogue + prior-turn message bytes stay byte-stable across turns and llama.cpp's KV-cache prefix reuse can hit (#3033).
Isolated mnemopi's local embedding provider in a dedicated Bun.spawn subprocess so onnxruntime-node and fastembed never load into the main agent process. Previously, memory.backend: mnemopi crashed Bun on Windows: standalone binaries faulted in the NAPI process.dlopen constructor at session start, and npm installs faulted in the NAPI finalizer at process teardown. Mirrors the tiny-model isolation pattern from #1607; the parent SIGKILLs the child on dispose so the destructor never runs in either address space (#3031).
Fixed image tool registration resolving image provider credentials during session startup, so broken or slow google-antigravity OAuth state no longer blocks sessions that never invoke generate_image (#3036).
Fixed LSP client returning -32601 Method not found for defined server→client requests (window/showMessageRequest, window/showDocument, workspace/{semanticTokens,inlayHint,codeLens,codeAction,diagnostic}/refresh). Servers that stall waiting for a real reply (same failure mode as #3029) now receive the spec no-op result (#3044).
Fixed WebP images being sent unchanged to local-server vision models, which can fail through llama.cpp/STB-backed decoders that do not support WebP (#2922).
Made getSettingsListTheme, getEditorTheme, getSelectListTheme, and getSymbolTheme return a plain ASCII fallback instead of crashing with "undefined is not an object (evaluating 'theme.fg')" when the global theme is undefined — e.g. when a plugin calls them before initTheme() completes or from a separate module instance under npm-global installs. (#2998)
Hardened TTS, STT, and tiny-title worker IPC send() paths against async EPIPE rejections: Subprocess.send() is now wrapped so neither a synchronous "process exited" throw nor an asynchronous EPIPE rejection (when the pipe breaks between exit being observed and the next send) can escape as a fatal unhandled rejection. A dying Kokoro/TTS/STT worker can no longer crash the whole agent session mid-task. (#2997)
Fixed Windows test failures caused by path handling: tests now use pathToFileURL, path.resolve, and path.join instead of hard-coded POSIX paths; shortenPath() normalizes backslashes to forward slashes after ~ and respects home directory boundaries; and interpolated paths in bash tool tests are now shell-escaped so Git Bash does not eat backslashes
Fixed HistoryStorage.resetInstance() leaking its SQLite database handle on Windows by adding a #close() method that finalizes all prepared statements and closes the database; AgentStorage gained the same resetInstance()/#close() pattern
Fixed createAgentSession leaking the internally-created AuthStorage when session construction fails before the session takes ownership, causing EBUSY on Windows temp dir cleanup
Fixed MnemopiBackend.removeDbFiles() throwing on Windows when the database handle is still being released; it is now truly best-effort (failures are logged rather than silently swallowed)
Fixed Windows EBUSY test failures by replacing raw fs.rmSync/fs.rm cleanup with TempDir (which retries) and best-effort .catch(() => {}) where SQLite handles outlive the test
Fixed TempDir prefix convention: non-@ prefixes created temp dirs relative to cwd instead of os.tmpdir(), causing module resolution failures on Windows
Fixed git line-ending mismatches in autoresearch tests by setting core.autocrlf false in test repo initialization
Fixed Bedrock inference-profile ARN models being dropped from the allowed-model set when models were scoped via enabledModels, the SDK, or ACP, so an accepted ARN no longer resolves to an empty selection. (#3006)
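
For the append-only auto-detect fix above (#3033), a hedged sketch of the host check might look like the following. It inspects only the literal hostname and performs no DNS resolution, which may differ from the real implementation. `isLocalHost` and `shouldAutoAppendOnly` are hypothetical names; the four provider ids are the ones listed in the note.

```typescript
// Provider ids named in the release note as prefix-cache friendly.
const LOCAL_PROVIDER_IDS = new Set(["ollama", "ollama-cloud", "lm-studio", "llama.cpp"]);

// True when the URL's literal hostname is loopback, RFC1918, or a .local name.
function isLocalHost(baseUrl: string): boolean {
  let host: string;
  try {
    host = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return false; // not a parseable URL
  }
  if (host === "localhost" || host === "[::1]" || host.endsWith(".local")) return true;
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  if (a === 127) return true; // 127.0.0.0/8 loopback
  if (a === 10) return true; // 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return true; // 172.16.0.0/12
  if (a === 192 && b === 168) return true; // 192.168.0.0/16
  return false;
}

// Enable append-only context for known local providers or local base URLs.
function shouldAutoAppendOnly(providerId: string, baseUrl?: string): boolean {
  if (LOCAL_PROVIDER_IDS.has(providerId)) return true;
  return baseUrl !== undefined && isLocalHost(baseUrl);
}
```

Under these assumptions, a custom provider pointed at `http://192.168.1.20:8000/v1` would qualify, while one pointed at a public hostname would not.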

@oh-my-pi/pi-mnemopi

Added

Exposed setLocalModelInitializer (and the LocalEmbeddingModel, LocalModelInitializer, LocalModelInitOptions, StandardEmbeddingModel types) so hosts can route fastembed loads through a dedicated subprocess and keep onnxruntime-node's NAPI constructor + finalizer out of their own address space. Same wipe semantics as the existing setLocalModelInitializerForTests seam; the agent CLI uses it to crash-proof Windows when memory.backend: mnemopi is enabled (#3031).

Fixed

Fixed background fact extraction skipping runtime-configured remote LLM endpoints when MNEMOPI_LLM_BASE_URL was unset, so remember(..., { extract: true }) now stores remote-distilled facts from mnemopi.llm config instead of falling back to regex heuristics. (#3041)
Fixed local fastembed startup on macOS ARM64 by letting fastembed@2.1.0 install its matching onnxruntime-node@1.21.0 native runtime instead of forcing 1.26.0, and by repairing missing tokenizer sidecars from the upstream Hugging Face model cache when a stale fastembed archive lacks them. (#3054)

@oh-my-pi/pi-utils

Changed

Expanded the TempDir Windows retry window from 4×10ms to 40×25ms (1s total) to accommodate SQLite WAL/SHM file handle release delays
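
To make the retry budget concrete: 4 × 10 ms gives a 40 ms window, while 40 × 25 ms gives 1000 ms. The sketch below shows a fixed-delay retry loop under those numbers. `RetryOptions`, `retryWindowMs`, and `retryOnBusy` are hypothetical names, and retrying only on `EBUSY` is an assumption, not a description of how TempDir decides what to retry.

```typescript
interface RetryOptions {
  attempts: number;
  delayMs: number;
}

const OLD_WINDOWS_RETRY: RetryOptions = { attempts: 4, delayMs: 10 };
const WINDOWS_RETRY: RetryOptions = { attempts: 40, delayMs: 25 };

// Nominal total wait if every attempt fails.
function retryWindowMs(opts: RetryOptions): number {
  return opts.attempts * opts.delayMs;
}

// Retry `op` while it fails with EBUSY, waiting `delayMs` after each failure.
// Any other error is rethrown immediately; the last EBUSY is rethrown once
// the attempts are exhausted.
async function retryOnBusy<T>(
  op: () => T | Promise<T>,
  opts: RetryOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.attempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      const code = (err as { code?: unknown } | null)?.code;
      if (code !== "EBUSY") throw err;
      lastError = err;
      await sleep(opts.delayMs);
    }
  }
  throw lastError;
}
```

The `sleep` parameter is injectable so tests can exercise the loop without actually waiting.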

Fixed

Made EPIPE rejections from IPC send() to worker subprocesses (syscall: "send") non-fatal: the global unhandledRejection handler now logs and continues instead of terminating the session when an optional subsystem's pipe breaks. A broken optional subsystem (TTS/STT/tiny-title/MCP) can no longer crash the whole agent session mid-task. (#2997)
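
The classification this fix describes can be sketched as a pure predicate plus a decision function. `isIpcSendEpipe` and `classifyRejection` are hypothetical names, and the real handler may check more than the `code` and `syscall` fields.

```typescript
// True for errors shaped like a broken IPC pipe on send():
// code "EPIPE" together with syscall "send".
function isIpcSendEpipe(reason: unknown): boolean {
  if (typeof reason !== "object" || reason === null) return false;
  const err = reason as { code?: unknown; syscall?: unknown };
  return err.code === "EPIPE" && err.syscall === "send";
}

// Decide what a global unhandled-rejection handler should do with `reason`:
// log and continue for a broken optional-subsystem pipe, otherwise treat as fatal.
function classifyRejection(
  reason: unknown,
  log: (message: string) => void,
): "continue" | "fatal" {
  if (isIpcSendEpipe(reason)) {
    log("worker IPC pipe closed during send(); continuing without it");
    return "continue";
  }
  return "fatal";
}
```

A host could call `classifyRejection` from a `process.on("unhandledRejection", ...)` listener and terminate only when it returns `"fatal"`.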

What's Changed

fix(providers): accept Bedrock inference profile ARNs by @roboomp in #3006
fix(ai): omit Ollama images for text-only models by @serverinspector in #3009
fix: Windows test failures — path handling, EBUSY, SQLite handle leaks by @oldschoola in #3019
test(ai): pin reasoning_content contract for null content deltas (#2996) by @oldschoola in #3020
fix(ipc): harden worker IPC send against async EPIPE rejections (#2997) by @oldschoola in #3021
fix(coding-agent): re-encode WebP for local-server models by @danzaio in #3023
fix(coding-agent): guard theme getters against undefined theme before init (#2998) by @oldschoola in #3026
fix(mnemopi): isolate local embeddings worker in subprocess by @roboomp in #3034
fix(coding-agent): delay image credential lookup until execution by @roboomp in #3037
fix(coding-agent): auto-enable append-only context for Ollama and local servers by @roboomp in #3038
fix(mnemopi): use runtime remote LLM for extraction by @roboomp in #3042
fix(catalog): omit Ollama Cloud output caps by @wolfiesch in #3043
fix(lsp): reply to defined server requests with spec no-ops by @roboomp in #3045
fix(mnemopi): restore local fastembed runtime on macOS by @roboomp in #3055

Full Changelog: v16.1.2...v16.1.3