🐈 nanobot v0.2.0 is here 🎉 — 105 PRs merged, 20 new contributors. The agent learned to hold a goal.
The headline is `/goal`. Mark a thread as a sustained objective with `long_task`, and the active goal stays pinned in Runtime Context every turn — surviving compaction, surviving long tool chains, surviving the model's own forgetfulness — until you call `complete_goal`. The wall-clock timeout widens automatically while a goal is active; streaming requests fall back to an idle timeout instead of a hard wall, so a model that's still thinking doesn't get killed mid-thought.
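The difference between the two timeout modes can be pictured with a small, self-contained sketch (purely illustrative, not nanobot's actual code): a hard wall clock kills a request at a fixed deadline, while an idle timeout only trips when the gap between consecutive chunks grows too long.

```python
def read_stream(chunks, idle_timeout):
    """Consume (arrival_time, token) pairs, failing only when the gap
    between consecutive chunks exceeds idle_timeout -- an idle timeout,
    not a fixed wall-clock deadline. Illustrative sketch only."""
    out, last = [], 0.0
    for arrival, token in chunks:
        if arrival - last > idle_timeout:
            raise TimeoutError(f"stream idle for {arrival - last:.1f}s")
        last = arrival
        out.append(token)
    return "".join(out)

# A stream that runs 9s end to end but never pauses more than 3s between
# tokens survives a 5s idle timeout, though a 5s hard wall would kill it.
stream = [(0.0, "thinking"), (3.0, " ..."), (6.0, " done"), (9.0, "")]
print(read_stream(stream, idle_timeout=5.0))  # → thinking ... done
```

The same stream with `idle_timeout=2.0` raises `TimeoutError`, which is the behavior a hard wall clock would impose from the start.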
The second story is the WebUI growing up. After several releases as a source-only preview, it now ships inside the wheel — `pip install nanobot-ai` and you have it. Settings and BYOK got a full redesign, the slash palette is localized, LAN access is gated by a token, reasoning streams live to the chat, and a brand-new image-generation tool turns "draw me X" into an inline preview without leaving the conversation.
The third story is the engine room. The agent loop got a real refactor — `AgentLoop.from_config()` for clean embedding, `_process_message` rewritten as a functional state machine, the archived summary moved into the system prompt for KV cache stability, tools converted to a self-describing plugin architecture, `ask_user` and `GlobTool` retired in favor of cleaner replacements. Five new providers (AWS Bedrock Converse, NVIDIA NIM, LongCat, Atomic Chat, MiMo) join the lineup, with `fallback_models` as a safety net so a single flaky endpoint can't take a turn down. Plus a batch of security fixes — SSRF blocking, media path confinement, softer workspace boundaries — and chat-native pairing, so DM approvals finally happen in the chat instead of in a config file.
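The failover idea behind `fallback_models` can be sketched in a few lines. This is a generic illustration under assumed names — `call_with_fallback`, `FakeModel`, and the `ask` method are hypothetical stand-ins, not nanobot's API:

```python
def call_with_fallback(prompt, primary, fallback_models):
    """Try the primary model, then each fallback in order; the first
    model that answers wins. Illustrative sketch, not nanobot's code."""
    errors = {}
    for model in [primary, *fallback_models]:
        try:
            return model, model.ask(prompt)
        except ConnectionError as exc:   # a "flaky endpoint" stands in
            errors[model.name] = exc     # for any provider failure
    raise RuntimeError(f"all models failed: {list(errors)}")

class FakeModel:
    """Toy provider: answers if ok, otherwise simulates a dead endpoint."""
    def __init__(self, name, ok):
        self.name, self.ok = name, ok
    def ask(self, prompt):
        if not self.ok:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name}: answer to {prompt!r}"

flaky = FakeModel("primary-endpoint", ok=False)
backup = FakeModel("backup-endpoint", ok=True)
model, reply = call_with_fallback("ping", flaky, [backup])
print(model.name)  # → backup-endpoint
```

The turn degrades to a secondary model instead of failing outright; only when every listed model errors does the caller see a failure.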
## Highlights

- **`/goal` and long-running tasks — an agent with a memory of why** — The new `long_task` tool, paired with `/goal` and `complete_goal`, marks a thread as a sustained objective. The active goal is mirrored in Runtime Context every turn so the agent stays anchored even after compaction, the WebUI surfaces the goal in the chat header, and the LLM wall timeout is automatically widened while a goal is active so longer reasoning passes don't get killed mid-thought. Core agents and subagents both honor the longer budget; streaming requests fall back to an idle timeout instead of a hard wall clock, so a model that's still emitting tokens won't be cut off prematurely. (#3788, #3855)
- **Image generation, end to end** — A new image-generation tool plus a WebUI image mode let you go from prompt to picture without leaving the chat. Generated images render inline with rounded previews; replay-window and dedup paths were tightened so images don't double-deliver across long sessions, and the consolidation pass now respects the replay window when hiding history. (#3695, #3687)
- **WebUI shipped in the wheel + a year's worth of polish** — `pip install nanobot-ai` now bundles the WebUI: enable the WebSocket channel, run `nanobot gateway`, open the browser. No `cd webui && bun run build` required. After several releases under a "preview" label, the WebUI is now a packaged surface. Inside it: a redesigned settings and BYOK flow (including BYOK web search), a localized slash palette, a model preset badge that stays in sync across slash commands and config reloads, streamed reasoning rendered live, image previews, LAN access gated by `tokenIssueSecret`, default-to-new-chat on load, scroll preservation on settings return, a `crypto.randomUUID` shim for non-secure-context LAN use, and dropped eager markdown preload to cut first-paint cost. (#3653, #3661, #3703, #3709, #3656, #3658, #3733, #3759, #3782)
- **Five new providers and a fallback safety net** — Native AWS Bedrock Converse lands as a first-class provider (#3574), with a follow-up that preserves Bedrock tool config across history (#3758). NVIDIA NIM (#3707), LongCat via OpenAI-compatible routing (#3114), Atomic Chat as a local OpenAI-compatible target (#3750), and MiMo with proper thinking-control wiring (#3734, #3851) round out the lineup. On top of the wider stack, `fallback_models` lets you list secondary models that take over when the primary fails (#3756), DeepSeek reasoning history is back-filled instead of dropped (#3616, #3560), Codex prompt cache keys are stabilized (#3793), and Anthropic auto-falls back to streaming on long-request errors (#3579).
- **Model presets and runtime switching** — `ModelPresetConfig` lets you name model + provider bundles in config and swap between them at runtime via `/model` (or the WebUI badge). Presets sync across slash commands, config reloads, and settings changes, so the model badge always matches what the next turn will hit. (#3714)
- **Core refactor — a cleaner agent loop, a smaller surface** — `AgentLoop.from_config()` centralizes loop assembly so embedders stop reaching into private internals (#3708). `_process_message` was rewritten as a functional state machine with explicit transitions instead of nested branches (#3715). The archived conversation summary moved into the system prompt to keep KV caches stable across compactions (#3711). Tools became a self-describing plugin architecture (#3729). `ask_user` was removed in favor of structured `message`-tool choices (#3757), and `GlobTool` was retired in favor of `read_file` glob support (#3841). Vulture- and coverage-verified dead code was excised (#3755, #3719). Logging now preserves tracebacks and carries channel context (#3651, #3678). Ruff F rules are fully enforced in CI (#3672), and the agent test suite was expanded and restructured (#3766). New `CLAUDE.md` and `.agent/` guides give AI contributors a stable on-ramp (#3534, #3860).
- **Pairing, DM approvals, and security hardening** — Chat-native pairing (#3774) lets you approve DMs from inside chat instead of editing config. On the safety front: SSRF blocked in DingTalk outbound media (#3569), Feishu downloaded-media filenames confined to safe paths (#3789), local media attachments confined for the message tool (#3842), SSRF guard recovery softened so a transient miss doesn't poison subsequent turns (#3635), and workspace boundary violations get a retry-throttled soft warning instead of crashing the loop (#3614). Telegram silently ignores unauthorized senders instead of leaking error responses (#3629).
- **Memory, dream, and session durability** — Cron `jobs.json` gained atomic writes with corrupt-store detection (#3606). The dream cursor only advances on completed batches (#3631) and restores correctly with memory state on resume (#3660). Replay-window hidden history is now properly consolidated (#3687). Workspace and tool-state changes survive restarts more reliably end to end.
- **Channel and platform fixes** — Feishu group threads honor reply chains and topic isolation (#3547, #3704, #3747, #3775), Matrix skips pre-startup events and stops the sync loop on irrecoverable auth (#3575, #3578), WhatsApp voice messages download cleanly (#3607), Wecom preserves real filenames (#3751), Telegram quiets unauthorized senders (#3629), and Weixin raises on send failures so messages don't silently drop (#3659). On Windows: UNC path support in shell extraction (#3764). On the CLI: surrogate code points are sanitized before the message bus (#3697), retry-wait messages no longer garble interactive output (#3705, #3609), and `nanobot provider logout` lands as a proper command (#3612).
- **Smaller things you'll feel** — Configurable `bot_name`/`bot_icon` (#3730), configurable `toolHintMaxLength` (#3641), real SSE streaming restored on the OpenAI-compatible API (#3677), Whisper transcription retries on transient failures (#3646), sequential MCP server connects to stop CPU spin (#3640), an MCP HTTP probe before connect (#3740), Brave search backoff under rate limits (#3840), `sender_id` in Runtime Context for user-aware responses (#3549), runtime context appended after user content for cache stability (#3844), `origin_message_id` outbound deduplication (#3561), and a Python SDK `RunResult` that finally exposes `tools_used` and `messages` (#3620).
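The atomic-write fix for the cron `jobs.json` store (#3606) follows a standard pattern worth sketching: write to a temp file in the same directory, swap it into place with `os.replace` (atomic on the same filesystem), and surface corruption on load instead of silently overwriting. This is a generic illustration, not nanobot's implementation:

```python
import json
import os
import tempfile

def save_store(path, data):
    """Write JSON atomically: a crash mid-write leaves the old file
    intact, because os.replace() swaps the file in one step."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
        os.replace(tmp, path)  # atomic swap into place
    finally:
        if os.path.exists(tmp):  # only on failure before the swap
            os.remove(tmp)

def load_store(path):
    """Surface corruption instead of silently overwriting the store."""
    with open(path) as f:
        raw = f.read()
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"corrupt store at {path}: {exc}") from exc

store = os.path.join(tempfile.gettempdir(), "jobs_demo.json")
save_store(store, {"jobs": [{"id": 1, "cron": "0 * * * *"}]})
print(load_store(store)["jobs"][0]["id"])  # → 1
```

Readers either see the old complete store or the new complete store, never a half-written one; a parse failure becomes a loud error rather than a quiet reset.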
## Community
Heartfelt thanks to everyone who shipped v0.2.0 — 105 PRs, 33 contributors, and a huge welcome to 20 first-time contributors. Every review, patch, and bug report helped; this release is a shared win. 🎉
## What's Changed
- fix(feishu): streaming card and tool hint respect reply_to_message in groups by @04cb in #3543
- Revert "fix(feishu): streaming card and tool hint respect reply_to_message in groups" by @Re-bin in #3548
- fix(feishu): respect reply_to_message for group threads by @boogieLing in #3547
- fix(agent): respect configured max iterations for subagents by @boogieLing in #3532
- feat(provider): add native AWS Bedrock Converse support by @Re-bin in #3574
- feat(skill): add nanobot upgrade wizard skill by @chengyongru in #3539
- bugfix(#3751): ReadFileTool says "File unchanged since last read:" across different sessions by @LZDQ in #3576
- refactor: replace try-except blocks with contextlib.suppress for cleaner error handling across multiple files by @JackLuguibin in #3566
- fix(matrix): skip events received before bot startup by @coldxiangyu163 in #3575
- fix(matrix): don't send empty room messages from blank progress callbacks by @halldorjanetzko in #3573
- [security] fix(dingtalk): block SSRF in outbound media fetches by @Hinotoi-agent in #3569
- fix(matrix): correct allow_room_mentions default type by @spinvettle in #3563
- fix: API stream lifecycle for tool-backed requests by @boogieLing in #3555
- feat(context): Add sender_id to LLM runtime context for user-aware responses by @yorkhellen in #3549
- fix: origin_message_id support and outbound deduplication by @tongtianli03-code in #3561
- fix: adjust DeepSeek reasoning mode check condition by @JiajunBernoulli in #3560
- fix(web_fetch): sanitize URL to strip markdown backticks and quotes before validation by @XJPeng12 in #3528
- fix: strip partial think tags in streaming output by @hongshunanhai in #3577
- fix(matrix): stop sync loop on irrecoverable auth errors by @coldxiangyu163 in #3578
- fix(anthropic): auto-fallback to stream on long-request error by @coldxiangyu163 in #3579
- fix(helpers): restore tiktoken fallback in estimate_prompt_tokens_chain by @yorkhellen in #3582
- feat(provider): add LongCat via OpenAI-compatible backend by @morandot in #3114
- fix: allow_patterns take priority over deny_patterns in ExecTool by @chengyongru in #3594
- Improve beta WebUI turn completion and chat isolation by @ramonpaolo in #3583
- fix(cli): stop provider retry messages garbling interactive output by @04cb in #3609
- fix(cron): atomic write for jobs.json + don't silently overwrite corrupt store by @hussein1362 in #3606
- fix(runner): soft workspace boundary with retry throttle by @Re-bin in #3614
- fix(agent): prevent safety guard false positives and streamed message drop by @chengyongru in #3613
- fix(bridge): support WhatsApp voice message download by @yorkhellen in #3607
- feat(cli): add provider logout command by @chengyongru in #3612
- fix: backfill DeepSeek reasoning_content history instead of dropping it (#3554, #3584) by @04cb in #3616
- feat(agent): limit concurrent subagent execution by @chengyongru in #3634
- fix: only advance dream_cursor on completed batches to prevent silent loss by @JiajunBernoulli in #3631
- fix: return absolute path for downloaded Feishu media files by @futurist in #3632
- fix(telegram): ignore unauthorized users silently by @kaseru in #3629
- fix(sdk): populate RunResult.tools_used and RunResult.messages by @chengyongru in #3620
- fix(agent): soften SSRF guard recovery by @Re-bin in #3635
- refactor(logging): preserve tracebacks and add channel context by @chengyongru in #3651
- fix(agent): gate provider progress deltas for non-streaming channels by @boogieLing in #3645
- feat(config): add toolHintMaxLength to control tool hint truncation by @chengyongru in #3641
- fix: use sequential MCP server connections to prevent CPU spin by @chengyongru in #3640
- fix(transcription): retry Whisper calls on transient failures by @chengyongru in #3646
- feat(webui): polish chat layout and titles by @Re-bin in #3653
- fix(webui): allow LAN access when host is 0.0.0.0 by @chengyongru in #3656
- fix(webui): require token_issue_secret for LAN bootstrap with frontend auth by @chengyongru in #3658
- fix(weixin): raise exceptions on message send failure to prevent silent loss by @chengyongru in #3659
- feat(webui): polish chat UX and slash commands by @Re-bin in #3661
- fix(dream): restore cursor with memory state by @Jefsky in #3660
- refactor(logging): preserve tracebacks in remaining except blocks by @chengyongru in #3678
- chore(ci): Enable full ruff -F (all F rules) checks and fix related errors by @yorkhellen in #3672
- fix(api): remove enable_compression to restore real SSE streaming by @zhw415876999-prog in #3677
- fix(onboard): allow empty strings and falsy values in input fields by @chengyongru in #3691
- feat: add image generation tool and WebUI mode by @Re-bin in #3695
- fix(memory): consolidate history hidden by replay window by @Re-bin in #3687
- feat(webui): redesign settings and BYOK configuration by @Re-bin in #3703
- fix(cli): sanitize surrogate code points before entering message bus by @chengyongru in #3697
- fix(feishu): send all messages to topic when in thread by @yorkhellen in #3704
- fix: log errors in silent exception handlers (matrix + weixin channels) by @vystartasv in #3664
- fix(cli): handle retry-wait messages in interactive mode by @eugenechae in #3705
- docs: add CLAUDE.md and .agent/ guides for AI contributors by @chengyongru in #3534
- feat(webui): add BYOK web search settings by @Re-bin in #3709
- fix(agent): persist _last_summary across restarts with used sentinel by @chengyongru in #3685
- Revert "fix(agent): persist _last_summary across restarts with used sentinel" by @Re-bin in #3710
- refactor: introduce AgentLoop.from_config() to centralize loop assembly by @chengyongru in #3708
- refactor(loop): convert _process_message to functional state machine by @chengyongru in #3715
- fix(utils): remove unreachable dead code in find_legal_message_start by @chengyongru in #3719
- fix(agent): move archived summary into system prompt for KV cache stability by @chengyongru in #3711
- feat: add NVIDIA NIM provider support by @barreler126 in #3707
- feat(cli): make bot name and icon configurable (#3650) by @pixan-ai in #3730
- fix(providers): wire MiMo to thinking_type to allow disabling reasoning (#3585) by @pixan-ai in #3734
- fix(webui): shim crypto.randomUUID for non-secure contexts by @NearlCrews in #3733
- fix(wecom): preserve real filename from SDK when payload omits name (#3737) by @04cb in #3751
- refactor(tools): plugin architecture with self-describing tools by @chengyongru in #3729
- fix(providers): set supports_max_completion_tokens for VolcEngine providers by @AlbertWang688 in #3738
- feat(feishu): add topic_isolation config switch by @yorkhellen in #3747
- feat(config): add ModelPresetConfig and runtime preset switching by @chengyongru in #3714
- chore: remove dead code (vulture + coverage verified) by @chengyongru in #3755
- fix(provider): preserve Bedrock tool config for history by @Re-bin in #3758
- refactor(agent): remove ask_user tool by @chengyongru in #3757
- fix(webui): default to new chat on load and preserve scroll on settings return by @XJPeng12 in #3759
- test(agent): expand coverage and refactor test structure by @chengyongru in #3766
- feat(reason): display model reasoning content during streaming by @Flinn-X in #3655
- feat(runner): model failover with fallback_models by @chengyongru in #3756
- fix(mcp): probe HTTP port before connecting to prevent event-loop crash by @chengyongru in #3740
- fix(feishu): register no-op handlers for bot member events by @chengyongru in #3775
- fix(agent): persist shortcut commands without polluting LLM context by @chengyongru in #3779
- [security] fix(feishu): confine downloaded media filenames by @Hinotoi-agent in #3789
- feat(pairing): chat-native DM sender approval by @chengyongru in #3774
- fix(shell): support UNC paths in Windows path extraction by @JiajunBernoulli in #3764
- fix: clear media_paths after successful voice transcription by @tamvicky in #3752
- refactor(tools): remove GlobTool by @chengyongru in #3841
- [security] fix(message): confine local media attachments by @Hinotoi-agent in #3842
- perf(agent): append runtime context after user content for cache stability by @chengyongru in #3844
- fix(web): back off Brave search rate limits by @boogieLing in #3840
- fix(codex): stabilize prompt cache key by @boogieLing in #3793
- feat(goal): /goal command & long-running tasks (long_task) by @Re-bin in #3788
- fix(webui): remove eager markdown preload by @yorkhellen in #3782
- feat: add Atomic Chat as OpenAI-compatible local LLM provider by @yanalialiuk in #3750
- docs(contributing): warn that `ruff format` predates the codebase by @olgagaga in #3850
- fix(agent): align LLM wall timeout with sustained goals for agents by @Re-bin in #3855
- fix(exec): allow format in URL parameters while still blocking the format command by @Endeavour-Yuan in #3853
- docs: update CLAUDE.md to reflect current codebase by @chengyongru in #3860
- fix(agent): remove duplicate runtime context injection in mid-turn drain by @chengyongru in #3859
- fix(providers): wire MiMo thinking control on gateway providers (#3845) by @olgagaga in #3851
## New Contributors
- @LZDQ made their first contribution in #3576
- @halldorjanetzko made their first contribution in #3573
- @Hinotoi-agent made their first contribution in #3569
- @spinvettle made their first contribution in #3563
- @tongtianli03-code made their first contribution in #3561
- @hongshunanhai made their first contribution in #3577
- @futurist made their first contribution in #3632
- @kaseru made their first contribution in #3629
- @Jefsky made their first contribution in #3660
- @zhw415876999-prog made their first contribution in #3677
- @vystartasv made their first contribution in #3664
- @eugenechae made their first contribution in #3705
- @barreler126 made their first contribution in #3707
- @NearlCrews made their first contribution in #3733
- @AlbertWang688 made their first contribution in #3738
- @Flinn-X made their first contribution in #3655
- @tamvicky made their first contribution in #3752
- @yanalialiuk made their first contribution in #3750
- @olgagaga made their first contribution in #3850
- @Endeavour-Yuan made their first contribution in #3853
**Full Changelog**: v0.1.5.post3...v0.2.0