🎉 nanobot v0.1.5.post3 is here 🎉 with 57 PRs merged and 12 new contributors. The agent learned to talk in threads.
If v0.1.5.post2 was about reach and polish, v0.1.5.post3 is about conversations becoming first-class citizens of their platform. Feishu group topics get isolated sessions. Discord threads inherit parent allowlists and keep their own context. Telegram can render inline keyboard choices. MSTeams prunes stale conversation references so outbound messages stop failing silently. And across all channels, `sendProgress` and `sendToolHints` can now be overridden per channel: quiet the noisy ones, keep the verbose ones. Underneath, DeepSeek-V4 is supported end to end: thinking mode and legacy session compatibility ship together, with follow-up fixes for incomplete reasoning history and non-string message content so long threads stay stable. A new `ask_user` tool lets the agent pause and ask you to choose mid-task. Olostep and Hugging Face joined the provider lineup, and a pair of timeout env vars (`NANOBOT_LLM_TIMEOUT_S` and `NANOBOT_OPENAI_COMPAT_TIMEOUT_S`) keeps hung requests from holding your session hostage. The WebUI continued to evolve (image uploads, video rendering, ask-user choices, model settings) but remains source-preview only, not bundled into the wheel.
Highlights
- **Threads everywhere (Feishu, Discord, Slack, MSTeams).** Each channel grew up this release. Feishu group topics now isolate sessions so messages in one topic don't leak into another; streaming cards and tool hints follow the original topic. Discord threads inherit their parent channel's `allowChannels` and get session isolation, which also means slash commands respect the allowlist. Slack stopped losing thread context on proactive replies. MSTeams conversation references gained TTL-based pruning (`refTtlDays`), auto-cleanup for Web Chat refs, and a touch interval to keep active refs alive. The theme is consistent: conversations belong to their thread, not to the channel at large. (#3449, #3397, #3440, #3462, #3475, #3447, #3487)
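For MSTeams deployments that accumulate stale references, the pruning knob can be set per channel. A minimal sketch, assuming the channel block lives under a `channels` key; only `refTtlDays` comes from the notes, the surrounding structure is illustrative:

```json
{
  "channels": {
    "msteams": {
      "enabled": true,
      "refTtlDays": 30
    }
  }
}
```

With a 30-day TTL, conversation references untouched for a month are pruned instead of producing silent send failures.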
- **Per-channel progress and interaction controls.** `sendProgress` and `sendToolHints` used to be global on-or-off switches. Now you can place them inside any individual channel config to override the global default: keep Telegram quiet while WebSocket stays verbose. The agent also learned to ask users structured questions mid-task via the new `ask_user` tool: in WebUI these render as buttons, in other channels they fall back to text. Telegram got inline keyboards (`inline_keyboards: true`) for rendering `message` tool button choices. The `/history [n]` command lets you review recent messages without scrolling. (#3487, #2791, #3398, #3454, #3466)
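The per-channel override pattern can be sketched like this; the key names `sendProgress`, `sendToolHints`, and `inline_keyboards` are from the notes above, while the nesting under a top-level `channels` map is an assumption about the config layout:

```json
{
  "sendProgress": true,
  "sendToolHints": true,
  "channels": {
    "telegram": {
      "sendProgress": false,
      "sendToolHints": false,
      "inline_keyboards": true
    },
    "websocket": {
      "sendProgress": true,
      "sendToolHints": true
    }
  }
}
```

Here the globals stay on, Telegram opts out of progress chatter but gains inline keyboards, and WebSocket keeps the verbose defaults.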
- **DeepSeek-V4 and the wider provider stack.** This release makes DeepSeek-V4 a first-class target: thinking mode and legacy session compatibility land in one go (#3420), so you can point the agent at V4 without abandoning older conversations. Real-world transcripts exposed two gaps that got dedicated follow-ups, truncated or incomplete reasoning history (#3453) and heterogeneous (non-string) message content (#3458), so tool-heavy sessions don't fall over mid-run. On top of that, Hugging Face Inference Providers arrived as a first-class provider (#3496), Olostep joined web search (#3505), OpenAI-compatible endpoints gained `extraBody` for vLLM guided decoding and friends (#3491), and the timeout pair `NANOBOT_LLM_TIMEOUT_S`/`NANOBOT_OPENAI_COMPAT_TIMEOUT_S` splits outer turn limits from inner HTTP bounds (#3428, #3478). GitHub Copilot routes GPT-5 and o-series models correctly (#3380); Gemini routing picks up `reasoning_effort="none"` and Gemma (#3515).
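A sketch of what `extraBody` enables against a vLLM endpoint; the provider block name, `apiBase` key, and model name are hypothetical, and `guided_json` is one of vLLM's guided-decoding request extras:

```json
{
  "providers": {
    "openai_compat": {
      "apiBase": "http://localhost:8000/v1",
      "model": "my-vllm-model",
      "extraBody": {
        "guided_json": {
          "type": "object",
          "properties": { "answer": { "type": "string" } }
        }
      }
    }
  }
}
```

For the timeouts, something like `export NANOBOT_LLM_TIMEOUT_S=300` and `export NANOBOT_OPENAI_COMPAT_TIMEOUT_S=120` keeps the inner HTTP bound tighter than the outer turn limit, so a hung request fails fast instead of pinning the session.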
- **Memory and session hardening.** `consolidationRatio` (0.1–0.95) lets you tune how aggressively token-triggered consolidation compresses context. `maxMessages` (default 120) caps the replay window without touching persistence. History.jsonl gained atomic writes with fsync and directory sync, closing the last data-loss window on unexpected shutdowns. A raw_archive bloat path and several stuck-consolidation edges were sealed. Sessions now fsync on graceful shutdown. The result: memory is both more tunable and more durable. (#3285, #3482, #3508, #3369, #3412, #3415, #3459)
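The two memory knobs can be combined in one config fragment; the field names and value ranges come from the notes above, while the `memory` section name is an assumption about where they live:

```json
{
  "memory": {
    "consolidationRatio": 0.5,
    "maxMessages": 120
  }
}
```

A lower ratio compresses harder when token-triggered consolidation fires; `maxMessages` only bounds what is replayed into context, not what is persisted.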
- **macOS LaunchAgent and deployment.** A new `docs/deployment.md` section walks through deploying `nanobot gateway` as a macOS LaunchAgent: plist, `launchctl bootstrap/enable/kickstart`, log paths, and the inevitable port-conflict gotcha when you forget to stop a manual gateway. Useful for anyone who wants the agent online at login without keeping a terminal open. (#3441)
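The launchctl sequence looks roughly like this; the label `com.nanobot.gateway` and plist path are illustrative, and `docs/deployment.md` has the canonical steps:

```sh
# Load the agent into the current GUI session
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.nanobot.gateway.plist
# Clear any disabled state left over from earlier experiments
launchctl enable gui/$(id -u)/com.nanobot.gateway
# (Re)start immediately instead of waiting for the next login
launchctl kickstart -k gui/$(id -u)/com.nanobot.gateway
```

Stop any manually started gateway first, or `bootstrap` will load a service that then loses the port-bind race and crash-loops.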
- **Security and reliability fixes.** A shell injection vector via `path_append` on non-Windows platforms was closed. Workspace directory violations now stop the agent loop instead of logging a warning. `resolve_config_env_vars` stopped stripping excluded fields. Anthropic image_url blocks inside tool_result content get converted correctly. MCP capability names are sanitized for model API compatibility. Windows MCP stdio launchers avoid WinError 193. Structured tool-event payloads give channels richer progress data. Document parsers lazy-import to cut cold-start time. Twenty-odd smaller fixes across providers, channels, and the agent loop round out the release. (#3366, #3493, #3383, #3387, #3470, #3379, #3399, #3423)
Community
Heartfelt thanks to everyone who shipped v0.1.5.post3: 57 PRs, 27 contributors, and a huge welcome to 12 first-time contributors. Every review, patch, and bug report helped; this release is a shared win. 🎉
What's Changed
- feat(transcription): add language parameter for Whisper STT by @chengyongru in #3375
- fix: normalize DashScope reasoning_effort (minimal vs minimum) by @hlgone in #3374
- fix(cli): respect sys.stdout.isatty() in commands.py by @knightconnorp in #3370
- fix(session): fsync sessions on graceful shutdown to prevent data loss by @hussein1362 in #3369
- fix(providers): support GPT-5 models on GitHub Copilot backend by @gongpx20069 in #3380
- fix(mcp): avoid WinError 193 for Windows stdio launchers by @lahuman in #3379
- fix(agent): prevent duplicate responses when sub-agents complete concurrently by @chengyongru in #3385
- fix(anthropic): convert image_url blocks inside tool_result content by @tetratorus in #3387
- fix(config): preserve excluded fields in resolve_config_env_vars by @saimonventura in #3383
- feat(webui): image attachments โ composer + signed media pipeline by @Re-bin in #3393
- feat(telegram): add inline keyboard buttons by @gthieleb in #3398
- feat(agent): emit structured tool-event payloads via on_progress by @pblocz in #3399
- fix(agent): prevent history.jsonl bloat from raw_archive and stuck consolidation by @chengyongru in #3412
- fix(agent): bound remaining memory/history pollution paths by @Re-bin in #3415
- fix(provider): support DeepSeek V4 thinking mode and legacy session compatibility by @Re-bin in #3420
- fix(anthropic): omit temperature for opus-4-7 (#3417) by @04cb in #3418
- perf(document): lazy-import heavy document parsers by @mvanhorn in #3423
- feat(channels): add video support for Telegram and WebSocket by @Re-bin in #3429
- feat(webui): render video media attachments by @Re-bin in #3430
- fix(agent): add LLM request timeout to prevent session lock starvation by @yorkhellen in #3428
- docs: add macOS LaunchAgent setup for gateway by @choiking in #3441
- feat(agent): add ask user tool by @lzmjlrt in #2791
- fix(msteams): send threaded replies via replyToId by @chengyongru in #3447
- feat(feishu): thread-scoped sessions, reply_in_thread, non-blocking reaction by @chengyongru in #3449
- fix(providers): disable HTTP keepalive for local/LAN endpoints by @hussein1362 in #3444
- fix(heartbeat): inject delivered messages into channel session for reply continuity by @hussein1362 in #3391
- fix(provider): gate reasoning-to-content fallback behind spec flag by @chengyongru in #3446
- feat(memory): make consolidate ratio configurable by @chengyongru in #3285
- fix(security): prevent shell injection via path_append on non-Windows platforms by @yorkhellen in #3366
- fix(provider): handle incomplete DeepSeek reasoning history by @Re-bin in #3453
- feat(webui): add ask-user choices and model settings by @Re-bin in #3454
- fix(slack): preserve thread context for proactive replies by @Re-bin in #3462
- fix(agent): expose session timestamps in model context by @Re-bin in #3463
- fix(mcp): sanitize MCP capability names for model API compatibility by @chengyongru in #3470
- fix(slack): polish threaded replies and proactive delivery by @Re-bin in #3475
- fix(agent): resolve relative media paths in MessageTool by @chengyongru in #3471
- fix(agent): subagent announces from threaded callers route to channel session, not the originating thread by @mt-huerta in #3465
- fix(provider): normalize DeepSeek non-string message content by @boogieLing in #3458
- fix(msteams): auto-clean unsupported or expired MSTeams conversation references to prevent notification send failures (Fixes #3433) by @zhuzhh in #3440
- feat(session): enforce replay/file-cap invariants for history lifecycle by @boogieLing in #3459
- fix(discord): full thread support with session isolation and allowlist enforcement by @Lbin91 in #3397
- fix(provider): bound OpenAI-compatible request timeouts by @boogieLing in #3478
- fix(heartbeat): prevent internal reasoning leaks and finalization fallback in delivery by @hussein1362 in #3389
- feat(command): add /history command to review recent session messages by @LeoFYH in #3466
- fix(codex): stream progress deltas to channels by @boogieLing in #3480
- feat(config): wire max_messages into session history replay by @hussein1362 in #3482
- feat(providers): add Hugging Face inference provider by @chengyongru in #3496
- The agent loop should be stopped when workspace directory restrictions are violated by @lvqiushi in #3493
- fix(channels): send telegram attachments with named file path by @Seym0n in #3489
- feat(web-tools): Improve to allow bypassing Cloudflare captchas by @Mizarka in #3382
- feat(providers): add extra_body config for OpenAI-compatible endpoints by @hussein1362 in #3491
- fix(feishu): Fix done emoji addition and on-it emoji removal before task actually ends by @BarclayII in #3502
- feat(web_search): add olostep provider by @chengyongru in #3505
- Fix reasoning_effort="none" handling and add gemma to Gemini routing by @masterlyj in #3515
- fix: sanitize Matrix user_id for Windows-safe store file names by @JiajunBernoulli in #3510
- feat(channels): support per-channel progress controls by @boogieLing in #3487
- fix(memory): ensure atomic write for history.jsonl by @yorkhellen in #3508
New Contributors
- @knightconnorp made their first contribution in #3370
- @tetratorus made their first contribution in #3387
- @saimonventura made their first contribution in #3383
- @gthieleb made their first contribution in #3398
- @pblocz made their first contribution in #3399
- @mvanhorn made their first contribution in #3423
- @choiking made their first contribution in #3441
- @mt-huerta made their first contribution in #3465
- @zhuzhh made their first contribution in #3440
- @Seym0n made their first contribution in #3489
- @Mizarka made their first contribution in #3382
- @BarclayII made their first contribution in #3502
Full Changelog: v0.1.5.post2...v0.1.5.post3