@oh-my-pi/pi-ai
Added
- Added a regression test pinning that `openai-completions` emits a `thinking` block for `reasoning_content` deltas even when `delta.content` is explicitly JSON `null` (the DeepSeek-format dual-key pattern used by custom GLM/Qwen reasoning providers). See #2996.
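The contract pinned by that regression test can be sketched roughly as below. This is an illustration only, not the package's code: the `Delta` and `Block` shapes and the `collectBlocks` helper are hypothetical names, and only the `content` and `reasoning_content` keys come from the note above.

```typescript
// Hypothetical shape of one streamed chat-completions delta (illustration only).
interface Delta {
  content?: string | null;
  reasoning_content?: string | null;
}

type Block = { type: "thinking" | "text"; text: string };

// Fold deltas into blocks. A JSON null `content` must not suppress
// the `reasoning_content` carried by the same delta.
function collectBlocks(deltas: Delta[]): Block[] {
  const out: Block[] = [];
  const append = (type: Block["type"], chunk: string): void => {
    const last = out[out.length - 1];
    if (last && last.type === type) last.text += chunk;
    else out.push({ type, text: chunk });
  };
  for (const delta of deltas) {
    if (typeof delta.reasoning_content === "string" && delta.reasoning_content.length > 0) {
      append("thinking", delta.reasoning_content);
    }
    if (typeof delta.content === "string" && delta.content.length > 0) {
      append("text", delta.content);
    }
  }
  return out;
}

// Dual-key stream: reasoning arrives while `content` is explicitly null.
const demoBlocks = collectBlocks([
  { content: null, reasoning_content: "Let me " },
  { content: null, reasoning_content: "think." },
  { content: "Answer." },
]);
```

Checking `typeof ... === "string"` rather than truthiness of the whole delta keeps `null` and missing keys on the same path, which is the behavior the test is meant to lock in.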
Changed
- Improved the thinking loop guard to treat assistant text loops as retryable errors
- Refined text normalization logic to reduce false positives in the thinking loop detector
Fixed
- Fixed Ollama chat requests sending image payloads to text-only models. Image blocks are now omitted and replaced with the standard non-vision placeholder for models without vision support, while vision-capable Ollama models continue to receive images. (#3009 by @serverinspector)
- Fixed `SqliteAuthCredentialStore.close()` leaking one-off prepared statements created by inline `this.#db.prepare()` calls in `#authCredentialsTableExists`, `#readAuthSchemaVersion`, `#inferAuthSchemaVersion`, `#migrateAuthSchemaV0ToV1`, `#backfillCredentialIdentityKeys`, and `updateAuthCredential`. Each statement is now wrapped in `try`/`finally` with `stmt.finalize()`, and the `close()` method finalizes `#insertUsageCostStmt` and `#listUsageCostsStmt`, which were previously missed. The leak caused EBUSY on Windows when tests tried to delete temp dirs containing open SQLite handles.
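The `try`/`finally` wrapping described above might look roughly like the sketch below. The `Statement` and `Database` interfaces, `queryOnce`, and `FakeDb` are hypothetical stand-ins so the example runs without a SQLite binding; only the `prepare()` and `finalize()` names come from the note.

```typescript
// Hypothetical minimal interfaces; a real SQLite binding exposes more.
interface Statement<T> {
  get(): T | undefined;
  finalize(): void;
}
interface Database {
  prepare<T>(sql: string): Statement<T>;
}

// Run a one-off query and always finalize the statement, even if get() throws.
function queryOnce<T>(db: Database, sql: string): T | undefined {
  const stmt = db.prepare<T>(sql);
  try {
    return stmt.get();
  } finally {
    stmt.finalize();
  }
}

// In-memory fake that counts open statements, to show that nothing leaks.
class FakeDb implements Database {
  open = 0;
  shouldThrow: boolean;
  constructor(shouldThrow: boolean) {
    this.shouldThrow = shouldThrow;
  }
  prepare<T>(_sql: string): Statement<T> {
    this.open++;
    return {
      get: () => {
        if (this.shouldThrow) throw new Error("query failed");
        return undefined;
      },
      finalize: () => {
        this.open--;
      },
    };
  }
}

const okDb = new FakeDb(false);
queryOnce(okDb, "SELECT 1");

const failingDb = new FakeDb(true);
let queryThrew = false;
try {
  queryOnce(failingDb, "SELECT 1");
} catch {
  queryThrew = true;
}
```

In both the success and the failure path the open-statement count returns to zero, which is the property that lets the database file be deleted afterwards.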
@oh-my-pi/pi-catalog
Fixed
- Marked Ollama Cloud catalog models to omit on-the-wire output-token caps, preventing context-window-sized `num_predict` values from causing HTTP 400s for models whose true output cap is not discoverable. (#2984)
- Fixed `readModelCache`/`writeModelCache` using a process-global shared database even when a custom `dbPath` was provided. Custom-path cache operations now open and close a per-call database via `withModelCacheDb`, preventing leaked SQLite handles on Windows
@oh-my-pi/pi-coding-agent
Changed
- Refactored Perplexity authentication logic to prioritize cookies over OAuth in search operations
- Updated the `token` command to correctly display active Perplexity OAuth tokens when present
Fixed
- Enabled auto-retry for AI "thinking loop" errors encountered during model inference
- Cleared stale error banners automatically when triggered by an auto-retry recovery phase
- Preserved the bundled `omitMaxOutputTokens` policy when fresh cached provider discovery rows replace Ollama Cloud catalog models, so stale `models.db` entries cannot re-enable context-window-sized `num_predict` values. (#2984)
- Normalized cached-only Ollama Cloud discovery rows to omit on-the-wire output-token caps even when the cached model id has no bundled catalog entry. (#2984)
- Fixed Ollama, LM Studio, and llama.cpp (plus loopback vLLM / sglang servers) reprocessing the full prompt on every turn because `provider.appendOnlyContext: auto` only recognized DeepSeek and Xiaomi as prefix-cache providers. The auto-detect now enables append-only mode for `ollama`, `ollama-cloud`, `lm-studio`, `llama.cpp`, and any `baseUrl` resolving to a loopback, RFC1918, or `.localhost` host, so the system prompt + tool catalogue + prior-turn message bytes stay byte-stable across turns and llama.cpp's KV-cache prefix reuse can hit (#3033).
- Isolated mnemopi's local embedding provider in a dedicated `Bun.spawn` subprocess so `onnxruntime-node` and `fastembed` never load into the main agent process. Previously `memory.backend: mnemopi` crashed Bun on Windows: standalone binaries faulted in the NAPI `process.dlopen` constructor at session start, and npm installs faulted in the NAPI finalizer at process teardown. Mirrors the tiny-model isolation pattern from #1607; the parent SIGKILLs the child on dispose so the destructor never runs in either address space (#3031).
- Fixed image tool registration resolving image provider credentials during session startup, so broken or slow `google-antigravity` OAuth state no longer blocks sessions that never invoke `generate_image` (#3036).
- Fixed LSP client returning `-32601 Method not found` for defined server→client requests (`window/showMessageRequest`, `window/showDocument`, `workspace/{semanticTokens,inlayHint,codeLens,codeAction,diagnostic}/refresh`). Servers that stall waiting for a real reply (same failure mode as #3029) now receive the spec no-op result (#3044).
- Fixed WebP images being sent unchanged to `local-server` vision models, which can fail through llama.cpp/STB-backed decoders that do not support WebP (#2922).
- Made `getSettingsListTheme`, `getEditorTheme`, `getSelectListTheme`, and `getSymbolTheme` return a plain ASCII fallback instead of crashing with "undefined is not an object (evaluating 'theme.fg')" when the global `theme` is undefined, for example when a plugin calls them before `initTheme()` completes or from a separate module instance under npm-global installs. (#2998)
- Hardened TTS, STT, and tiny-title worker IPC `send()` paths against async EPIPE rejections: `Subprocess.send()` is now wrapped so neither a synchronous "process exited" throw nor an asynchronous EPIPE rejection (when the pipe breaks between exit being observed and the next send) can escape as a fatal unhandled rejection. A dying Kokoro/TTS/STT worker can no longer crash the whole agent session mid-task. (#2997)
- Fixed Windows test failures caused by path handling: tests now use `pathToFileURL`, `path.resolve`, and `path.join` instead of hard-coded POSIX paths; `shortenPath()` normalizes backslashes to forward slashes after `~` and respects home directory boundaries; interpolated paths in bash tool tests are shell-escaped to prevent Git Bash from eating backslashes
- Fixed `HistoryStorage.resetInstance()` leaking its SQLite database handle on Windows by adding a `#close()` method that finalizes all prepared statements and closes the database; `AgentStorage` gained the same `resetInstance()`/`#close()` pattern
- Fixed `createAgentSession` leaking the internally-created `AuthStorage` when session construction fails before the session takes ownership, causing EBUSY on Windows temp dir cleanup
- Fixed `MnemopiBackend.removeDbFiles()` throwing on Windows when the database handle is still being released; it is now truly best-effort (logs failures instead of silently swallowing them)
- Fixed Windows EBUSY test failures by replacing raw `fs.rmSync`/`fs.rm` cleanup with `TempDir` (which retries) and best-effort `.catch(() => {})` where SQLite handles outlive the test
- Fixed `TempDir` prefix convention: non-`@` prefixes created temp dirs relative to cwd instead of `os.tmpdir()`, causing module resolution failures on Windows
- Fixed git line-ending mismatches in autoresearch tests by setting `core.autocrlf false` in test repo initialization
- Fixed Bedrock inference-profile ARN models being dropped from the allowed-model set when models were scoped via `enabledModels`, the SDK, or ACP, so an accepted ARN no longer resolves to an empty selection. (#3006)
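For the append-only auto-detect fix (#3033), the loopback/RFC1918/`.localhost` check could be approximated as below. This is a sketch under stated assumptions, not the agent's implementation: `isLocalBaseUrl` is a hypothetical helper, and it inspects only the literal hostname in the URL without performing any DNS resolution.

```typescript
// Hypothetical helper: does a base URL name a loopback, RFC 1918, or .localhost host?
// Only the literal hostname is inspected; no DNS lookup is performed.
function isLocalBaseUrl(baseUrl: string): boolean {
  let host: string;
  try {
    host = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return false; // not a parseable URL
  }
  if (host === "localhost" || host.endsWith(".localhost")) return true;
  if (host === "[::1]" || host === "::1") return true; // IPv6 loopback
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  if (a === 127) return true; // 127.0.0.0/8 loopback
  if (a === 10) return true; // 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return true; // 172.16.0.0/12
  if (a === 192 && b === 168) return true; // 192.168.0.0/16
  return false;
}

const localChecks = {
  loopback: isLocalBaseUrl("http://127.0.0.1:11434"),
  lan: isLocalBaseUrl("http://192.168.1.20:8080/v1"),
  dotLocalhost: isLocalBaseUrl("http://ollama.localhost:11434"),
  outsidePrivateRange: isLocalBaseUrl("http://172.32.0.1"),
  publicHost: isLocalBaseUrl("https://api.example.com/v1"),
  garbage: isLocalBaseUrl("not a url"),
};
```

A hostname-only check is cheap and deterministic, at the cost of missing names that merely resolve to a private address.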
@oh-my-pi/pi-mnemopi
Added
- Exposed `setLocalModelInitializer` (and the `LocalEmbeddingModel`, `LocalModelInitializer`, `LocalModelInitOptions`, `StandardEmbeddingModel` types) so hosts can route fastembed loads through a dedicated subprocess and keep `onnxruntime-node`'s NAPI constructor + finalizer out of their own address space. Same wipe semantics as the existing `setLocalModelInitializerForTests` seam; the agent CLI uses it to crash-proof Windows when `memory.backend: mnemopi` is enabled (#3031).
Fixed
- Fixed background fact extraction skipping runtime-configured remote LLM endpoints when `MNEMOPI_LLM_BASE_URL` was unset, so `remember(..., { extract: true })` now stores remote-distilled facts from `mnemopi.llm` config instead of falling back to regex heuristics. (#3041)
- Fixed local fastembed startup on macOS ARM64 by letting `fastembed@2.1.0` install its matching `onnxruntime-node@1.21.0` native runtime instead of forcing `1.26.0`, and by repairing missing tokenizer sidecars from the upstream Hugging Face model cache when a stale fastembed archive lacks them. (#3054)
@oh-my-pi/pi-utils
Changed
- Expanded the `TempDir` Windows retry window from 4×10ms to 40×25ms (1s total) to accommodate SQLite WAL/SHM file handle release delays
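A bounded retry window of this shape can be sketched as follows. The `retrySync` helper, its `RetryOptions`, and the injected `sleep` are hypothetical and are not `TempDir`'s actual code; only the 40 attempts and 25 ms delay come from the note above.

```typescript
interface RetryOptions {
  attempts: number; // e.g. 40
  delayMs: number; // e.g. 25
  sleep: (ms: number) => void; // injected so callers decide how to wait
}

// Retry `op` while it fails with a retryable error, up to `attempts` tries.
function retrySync<T>(
  op: () => T,
  isRetryable: (err: unknown) => boolean,
  opts: RetryOptions,
): T {
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.attempts; attempt++) {
    try {
      return op();
    } catch (err) {
      lastError = err;
      // Give up on non-retryable errors or once the last attempt has failed.
      if (!isRetryable(err) || attempt === opts.attempts) break;
      opts.sleep(opts.delayMs);
    }
  }
  throw lastError;
}

// Simulated cleanup that hits EBUSY three times before the handle is released.
const recordedWaits: number[] = [];
let cleanupCalls = 0;
const cleanupResult = retrySync(
  () => {
    cleanupCalls++;
    if (cleanupCalls <= 3) throw Object.assign(new Error("busy"), { code: "EBUSY" });
    return "removed";
  },
  (err) => (err as { code?: string }).code === "EBUSY",
  { attempts: 40, delayMs: 25, sleep: (ms) => { recordedWaits.push(ms); } },
);
```

With 40 attempts and a 25 ms delay between them, the worst case waits 39 × 25 ms = 975 ms, which matches the roughly one-second budget described above.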
Fixed
- Made EPIPE rejections from IPC `send()` to worker subprocesses (`syscall: "send"`) non-fatal: the global `unhandledRejection` handler now logs and continues instead of terminating the session when an optional subsystem's pipe breaks. A broken optional subsystem (TTS/STT/tiny-title/MCP) can no longer crash the whole agent session mid-task. (#2997)
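The classification step behind that handler could look roughly like the sketch below. `isIpcSendEpipe` and `classifyRejection` are hypothetical names, and the sketch assumes the rejection carries Node-style `code` and `syscall` fields, as the `syscall: "send"` detail above suggests.

```typescript
// Hypothetical predicate: is this rejection a broken IPC pipe from send()?
function isIpcSendEpipe(reason: unknown): boolean {
  if (typeof reason !== "object" || reason === null) return false;
  const err = reason as { code?: unknown; syscall?: unknown };
  return err.code === "EPIPE" && err.syscall === "send";
}

// Decide what a global unhandled-rejection handler should do with a reason.
function classifyRejection(reason: unknown): "log-and-continue" | "fatal" {
  return isIpcSendEpipe(reason) ? "log-and-continue" : "fatal";
}

const brokenPipe = Object.assign(new Error("write EPIPE"), { code: "EPIPE", syscall: "send" });
const otherSyscall = Object.assign(new Error("write EPIPE"), { code: "EPIPE", syscall: "write" });

const rejectionVerdicts = {
  brokenPipe: classifyRejection(brokenPipe),
  otherSyscall: classifyRejection(otherSyscall),
  plainError: classifyRejection(new Error("boom")),
  nonObject: classifyRejection("EPIPE"),
};
```

Matching on both fields keeps the exemption narrow, so EPIPE from other syscalls and unrelated rejections still take the fatal path.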
What's Changed
- fix(providers): accept Bedrock inference profile ARNs by @roboomp in #3006
- fix(ai): omit Ollama images for text-only models by @serverinspector in #3009
- fix: Windows test failures — path handling, EBUSY, SQLite handle leaks by @oldschoola in #3019
- test(ai): pin reasoning_content contract for null content deltas (#2996) by @oldschoola in #3020
- fix(ipc): harden worker IPC send against async EPIPE rejections (#2997) by @oldschoola in #3021
- fix(coding-agent): re-encode WebP for local-server models by @danzaio in #3023
- fix(coding-agent): guard theme getters against undefined theme before init (#2998) by @oldschoola in #3026
- fix(mnemopi): isolate local embeddings worker in subprocess by @roboomp in #3034
- fix(coding-agent): delay image credential lookup until execution by @roboomp in #3037
- fix(coding-agent): auto-enable append-only context for Ollama and local servers by @roboomp in #3038
- fix(mnemopi): use runtime remote LLM for extraction by @roboomp in #3042
- fix(catalog): omit Ollama Cloud output caps by @wolfiesch in #3043
- fix(lsp): reply to defined server requests with spec no-ops by @roboomp in #3045
- fix(mnemopi): restore local fastembed runtime on macOS by @roboomp in #3055
Full Changelog: v16.1.2...v16.1.3