v0.51.44 — Release T
5-PR contributor batch + comprehensive test-suite network isolation.
This release fuses five community PRs and adds a hermetic test-suite network isolation layer that cuts the full pytest run from 161s to 95s and eliminates the entire class of flaky failures caused by accidental outbound TLS handshakes.
What lands
| PR | Author | Class |
|---|---|---|
| #2048 | @Hinotoi-agent | [security] Workspace validation on session import |
| #2052 | @franksong2702 | First-run onboarding guide (181 LOC docs) |
| #2053 | @franksong2702 | Worktree-backed session creation (Closes #1955) |
| #2055 | @franksong2702 | Adjacent duplicate assistant message dedup (Closes #2051) |
| #1970 | @dobby-d-elf | First-class LM Studio provider with live model discovery |
Highlights
[security] Workspace validation on session import (#2048)
Before the fix, POST /api/session/import read workspace straight from the imported JSON without running it through resolve_trusted_workspace(). A crafted import with "workspace": "/" got persisted into the Session, after which /api/file?session_id=<sid>&path=etc/hosts resolved against / and served host files. The patch routes the imported value through the same resolver every other workspace-bearing endpoint already uses. Severity is highest on 0.0.0.0-bound / LAN-exposed deployments with password auth, where CVSS PR:L (low privileges required) applies.
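The shape of the fix can be illustrated with a minimal sketch. The real resolve_trusted_workspace() signature and trust rules are not shown in these notes, so resolve_workspace_sketch, import_session_sketch, UntrustedWorkspaceError, and the trusted_roots parameter below are all hypothetical stand-ins.

```python
from pathlib import Path


class UntrustedWorkspaceError(ValueError):
    """Raised when an imported workspace falls outside every trusted root."""


def resolve_workspace_sketch(raw, trusted_roots):
    # Hypothetical stand-in for resolve_trusted_workspace(): normalise the
    # path, then accept it only if it is a trusted root or lives under one.
    candidate = Path(raw).expanduser().resolve()
    for root in trusted_roots:
        root_path = Path(root).expanduser().resolve()
        if candidate == root_path or root_path in candidate.parents:
            return candidate
    raise UntrustedWorkspaceError(f"workspace not trusted: {raw!r}")


def import_session_sketch(payload, trusted_roots):
    # Validate before anything is persisted, so a crafted "/" is rejected
    # instead of being stored on the session.
    workspace = resolve_workspace_sketch(payload["workspace"], trusted_roots)
    return {**payload, "workspace": str(workspace)}
```

With a trusted root of /srv/workspaces (an example value), an import naming /srv/workspaces/proj passes, while an import naming / raises before any session state is written.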
Worktree-backed sessions (#2053)
POST /api/session/new now accepts a worktree: true flag that calls the agent's _setup_worktree() helper to create an isolated git worktree at <repo>/.worktrees/hermes-XXXX, persists worktree_path / worktree_branch / worktree_repo_root / worktree_created_at on the WebUI Session, surfaces a "New conversation in worktree" action in the workspace menu, and shows a subtle sidebar worktree indicator. Empty worktree sessions stay visible in the sidebar. Cleanup lifecycle is deferred to a follow-up — needs explicit safeguards for active streams, terminals, dirty files, and unpushed commits.
The underlying Hermes Agent helper may add .worktrees/ to the repository .gitignore the first time a worktree is created for that repo. Operators will see a small uncommitted edit to .gitignore after their first worktree session.
LM Studio first-class support (#1970)
Live model discovery for LM Studio. get_available_models() calls hermes_cli.provider_model_ids("lmstudio") first, falling back to a direct GET <base_url>/models request when env vars haven't been injected yet (fixes the race where the profile's .env isn't loaded before the picker runs). The new _get_provider_base_url() helper looks for base_url in two locations — cfg["providers"]["lmstudio"]["base_url"] AND cfg["model"]["base_url"] when cfg["model"]["provider"] == "lmstudio" — so both config shapes work.
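A minimal sketch of the two-location lookup, assuming cfg is a plain nested dict. The function name get_provider_base_url_sketch, the provider parameter, and the precedence order (providers entry first, then model) are assumptions for illustration, not the actual helper.

```python
def get_provider_base_url_sketch(cfg, provider="lmstudio"):
    # Shape 1: cfg["providers"][provider]["base_url"]
    providers = cfg.get("providers") or {}
    entry = providers.get(provider) or {}
    if entry.get("base_url"):
        return entry["base_url"]

    # Shape 2: cfg["model"]["base_url"], honoured only when the model
    # block is actually configured for this provider.
    model = cfg.get("model") or {}
    if model.get("provider") == provider and model.get("base_url"):
        return model["base_url"]

    return None  # caller decides what to do when no base URL is configured
```

Both {"providers": {"lmstudio": {"base_url": ...}}} and {"model": {"provider": "lmstudio", "base_url": ...}} resolve to a URL, while a model block configured for a different provider is ignored.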
Adjacent duplicate assistant transcript dedup (#2055)
_merge_display_messages_after_agent_result() now skips adjacent duplicate assistant messages by merge identity (role + content + tool_call_id + json.dumps(tool_calls, sort_keys=True)). Some provider/result replay paths produced two copies of the same assistant bubble in the current delta. The guard is intentionally adjacent-only so two separate turns that happen to produce identical text remain visible.
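The adjacent-only guard can be sketched as follows, assuming messages are dicts with optional role, content, tool_call_id, and tool_calls keys. The helper names merge_identity and drop_adjacent_duplicate_assistant are hypothetical; only the identity tuple comes from the description above.

```python
import json


def merge_identity(msg):
    # Identity as described above: role + content + tool_call_id + tool_calls
    # serialised with sorted keys so dict ordering does not matter.
    return (
        msg.get("role"),
        msg.get("content"),
        msg.get("tool_call_id"),
        json.dumps(msg.get("tool_calls"), sort_keys=True),
    )


def drop_adjacent_duplicate_assistant(messages):
    out = []
    for msg in messages:
        prev = out[-1] if out else None
        if (
            prev is not None
            and msg.get("role") == "assistant"
            and merge_identity(prev) == merge_identity(msg)
        ):
            continue  # same assistant bubble replayed back-to-back
        out.append(msg)
    return out
```

Because only the immediately preceding message is compared, an identical assistant reply in a later turn (with a user message in between) is kept.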
First-run onboarding guide (#2052)
181 lines of docs/onboarding.md covering install paths, safe wizard re-runs with isolated HERMES_HOME / HERMES_WEBUI_STATE_DIR, provider groups, Docker / local-server Base URL rules (the most common Discord support question: localhost inside a container is not the host running LM Studio or Ollama), workspace setup, password step, files written, and issue-reporting diagnostics. The PR also corrects the stale ~/.hermes/webui-mvp path to ~/.hermes/webui in .env.example and the README env-var table.
Maintainer review fixes (3)
Three things landed via maintainer-driven fixes on the stage branch rather than gating the contributor PRs:
- PR #1970 lmstudio regression: the new branch initially only looked at providers.lmstudio.base_url, missing the historical cfg.model.base_url shape that real users have. 3 pre-existing tests broke. Plus the branch's outer try/except was catching ImportError from hermes_cli, silently skipping the urlopen fallback on CI environments without hermes_cli. Both fixed.
- PR #2053 × PR #2041 state.db worktree recovery (silent data loss). Opus advisor caught this. The state.db sidecar reconciliation rebuilt sessions without their worktree_* fields, so worktree-backed sessions disappeared from the sidebar after recovery. Fixed by propagating workspace, worktree_path, worktree_branch, worktree_repo_root, worktree_created_at, and message_count from the state.db row.
- Test suite was making real outbound TLS to Anthropic + Amazon. During pytest debugging, ss -tnp showed the test_server subprocess maintaining ESTAB sockets to [2607:6bc0::10]:443 and 3.173.21.63:443. Some SDK init path was triggering real outbound, adding 60+s of wall-time and creating a class of flaky failures.
Test-suite network isolation (the big one)
Two new layers enforce no-outbound-by-default:
- Pytest process (tests/conftest.py module-level monkey-patch on socket.create_connection + socket.socket.connect). Allowed: loopback / RFC1918 / link-local / RFC5737 TEST-NET-3 / RFC2606 reserved TLDs. Everything else raises OSError("hermes test network isolation"). Tests that legitimately need real outbound opt back in via the new allow_outbound_network fixture (zero current callers).
- test_server subprocess (server.py): HERMES_WEBUI_TEST_NETWORK_BLOCK=1 env var (set by the test_server fixture on every spawn) activates an identical guard at the top of server.py at import time, before any api/* module loads. Env var unset in production → no-op.
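A rough sketch of how such a guard might look, using only the standard library. The function names (is_allowed_host, install_guard), the exact network list, and the set of reserved TLDs are assumptions; the actual conftest.py and server.py guards may differ in detail.

```python
import ipaddress
import socket

_ALLOWED_NETS = [ipaddress.ip_network(n) for n in (
    "127.0.0.0/8", "::1/128",                          # loopback
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",   # RFC1918
    "169.254.0.0/16", "fe80::/10",                     # link-local
    "203.0.113.0/24",                                  # RFC5737 TEST-NET-3
)]
_ALLOWED_TLDS = (".test", ".example", ".invalid", ".localhost")  # RFC2606


def is_allowed_host(host):
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: treat it as a hostname and check reserved TLDs.
        name = host.lower().rstrip(".")
        return name == "localhost" or name.endswith(_ALLOWED_TLDS)
    return any(addr in net for net in _ALLOWED_NETS)


def _check(address):
    # AF_UNIX addresses are plain str/bytes; only (host, port, ...) tuples
    # are inspected.
    if isinstance(address, tuple) and not is_allowed_host(str(address[0])):
        raise OSError("hermes test network isolation")


def install_guard():
    """Patch both entry points; return a callable that undoes the patch."""
    real_create = socket.create_connection
    real_connect = socket.socket.connect

    def guarded_create(address, *args, **kwargs):
        _check(address)
        return real_create(address, *args, **kwargs)

    def guarded_connect(self, address):
        _check(address)
        return real_connect(self, address)

    socket.create_connection = guarded_create
    socket.socket.connect = guarded_connect

    def restore():
        socket.create_connection = real_create
        socket.socket.connect = real_connect

    return restore
```

Returning a restore callable is one way an opt-in fixture like allow_outbound_network could temporarily lift the block; how the real fixture does this is not described here.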
10 adversarial tests in tests/test_conftest_network_isolation.py prove the block fires for the exact destinations we observed leaking and the allow-list passes loopback / RFC1918 / link-local / reserved-TLDs through.
test_dns_resolution_failure refactored to mock socket.getaddrinfo raising gaierror instead of relying on real DNS for *.invalid. Hermetic now.
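The mocking pattern can be sketched with unittest.mock. Here resolve_or_none is a hypothetical function under test, and the test body is illustrative rather than the refactored test itself.

```python
import socket
from unittest import mock


def resolve_or_none(host):
    # Hypothetical code under test: returns None when DNS resolution fails.
    try:
        return socket.getaddrinfo(host, 443)
    except socket.gaierror:
        return None


def test_dns_resolution_failure_sketch():
    boom = socket.gaierror(socket.EAI_NONAME, "Name or service not known")
    # Patch the resolver itself so no DNS query ever leaves the process.
    with mock.patch("socket.getaddrinfo", side_effect=boom) as fake:
        assert resolve_or_none("nonexistent.invalid") is None
    fake.assert_called_once_with("nonexistent.invalid", 443)
```

Since the failure is injected at socket.getaddrinfo, the test behaves the same whether or not the machine has a working resolver.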
Tests
5,166 → 5,200+ collected, all green on Python 3.11/3.12/3.13. Full suite wall-time: 161s → 95s locally (the previously-leaking outbound TLS handshakes were the long tail).
Contributors
@Hinotoi-agent (×1, first contribution) · @franksong2702 (×3) · @dobby-d-elf (×1, first contribution) · @nesquena (5 maintainer review fixes across the stage)
Notes
- The state.db × worktree recovery interaction (PR #2053 × PR #2041) is the second consecutive release where Opus advisor caught a real cross-PR data-loss bug that neither PR's individual test suite would have surfaced. Cross-PR adversarial review with grep-grounded prompts catches what unit tests miss when the failure mode lives at the seam between two features.
- Hermetic test infrastructure is a foundational improvement: every future PR review benefits from the 95s baseline and the impossibility of flaky-because-of-outbound failures. Worth its weight.