New features
Basic claude-mem Docker container (docker/claude-mem/)
A ready-to-run container for ad-hoc claude-mem testing with zero local setup beyond Docker.
- `FROM node:20`; layers pinned Bun (1.3.12) + uv (0.11.7) + the built plugin
- Non-root `node` user so `--permission-mode bypassPermissions` works headlessly
- `build.sh`, `run.sh` (auto-extracts OAuth from macOS Keychain or `~/.claude/.credentials.json`, falls back to `ANTHROPIC_API_KEY`), `entrypoint.sh`
- Persistent `.claude-mem/` mount so the observations DB survives container exit
Validated end-to-end: PostToolUse hook → queue → worker SDK call under subscription OAuth → <observation> XML → observations table → Chroma sync.
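The credential lookup order that `run.sh` is described as using (Keychain, then credentials file, then `ANTHROPIC_API_KEY`) can be sketched in Python as below. This is only an illustration of the fallback order, not the script itself: `resolve_credentials` and the injected `read_keychain` callable are hypothetical names, and the real Keychain lookup is left out.

```python
import json
import os
from pathlib import Path
from typing import Callable, Mapping, Optional, Tuple


def resolve_credentials(
    read_keychain: Callable[[], Optional[str]],
    creds_file: Path,
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Return (source, secret) using Keychain -> file -> API key order."""
    # 1. Keychain first. `read_keychain` is a hypothetical hook that returns
    #    None when no entry exists (or when not running on macOS).
    secret = read_keychain()
    if secret:
        return ("keychain", secret)

    # 2. Fall back to the credentials file, e.g. ~/.claude/.credentials.json.
    if creds_file.is_file():
        text = creds_file.read_text(encoding="utf-8")
        json.loads(text)  # raise on a corrupt file instead of passing it on
        return ("file", text)

    # 3. Last resort: a plain API key from the environment.
    api_key = env.get("ANTHROPIC_API_KEY")
    if api_key:
        return ("api_key", api_key)

    raise RuntimeError("no OAuth credentials or ANTHROPIC_API_KEY found")
```

Passing the Keychain reader in as a callable keeps the ordering logic testable on machines without a Keychain.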
SWE-bench evaluation harness (evals/swebench/)
Two-container split (our agent image + the upstream SWE-bench harness) for measuring claude-mem's effect on resolve rate.
- `Dockerfile.agent` → `claude-mem/swebench-agent:latest` (same non-root, version-pinned approach)
- `run-instance.sh`: two-turn ingest/fix protocol per instance; shallow clone at `base_commit` with full-clone fallback
- `run-batch.py`: parallel orchestrator with OAuth extraction, per-container naming, timeout enforcement + force-cleanup, `--overwrite` guard against silent truncation of partial results
- `eval.sh`: wraps `python -m swebench.harness.run_evaluation`
- `summarize.py`: aggregates per-instance reports
- `smoke-test.sh`: one-instance smoke test
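For readers unfamiliar with the `--overwrite` guard, the idea is roughly the following. This is a minimal sketch rather than the code in `run-batch.py`; the `open_results` helper and its signature are assumptions made for illustration.

```python
from pathlib import Path


def open_results(path: Path, overwrite: bool = False):
    """Open a results file for writing, refusing to clobber existing data.

    Opening with mode "w" truncates immediately, so a rerun would silently
    discard partial results unless the check happens before the open.
    """
    if path.exists() and path.stat().st_size > 0 and not overwrite:
        raise FileExistsError(
            f"{path} already has results; pass --overwrite to replace them"
        )
    return path.open("w", encoding="utf-8")
```

An empty or missing file is treated as safe to write, so a first run needs no flag.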
Fixes / hardening (from PR review)
- `chmod 600` on extracted OAuth creds files
- Grouped `{ chmod || true; }` so bash precedence can't mask failed `curl|sh` installs
- macOS creds: Keychain-first with file fallback for migrated / older setups
- `smoke-test.sh` `TIMEOUT` now actually enforced via `timeout`/`gtimeout` plus `docker rm -f` on exit 124
- Container naming + force-cleanup in `run-batch.py` timeout handler prevents orphan containers
- Fixed stdin-redirection collision in the consolidated `smoke-test.sh` JSON parser
- Drop `exec` in `run.sh` so the EXIT trap fires and cleans the temp creds file
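The timeout-plus-force-cleanup pattern from the items above might look like this in Python. It is a sketch under assumptions, not the actual handler: `run_with_cleanup` and `docker_rm_f` are hypothetical names, and returning 124 simply mirrors the exit code GNU `timeout` uses.

```python
import subprocess
from typing import Callable, Optional, Sequence


def docker_rm_f(name: str) -> None:
    """Force-remove a container by name; ignore errors if it is already gone."""
    subprocess.run(
        ["docker", "rm", "-f", name],
        check=False,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


def run_with_cleanup(
    cmd: Sequence[str],
    name: str,
    timeout_s: float,
    force_remove: Optional[Callable[[str], None]] = None,
) -> int:
    """Run cmd; on timeout, force-remove the named container and return 124."""
    cleanup = force_remove or docker_rm_f
    try:
        return subprocess.run(list(cmd), timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the local child process here, but
        # a container started through the Docker daemon can outlive its
        # client, so it is removed explicitly by its known name.
        cleanup(name)
        return 124
```

Giving each container a predictable name up front is what makes the cleanup step possible: the handler does not need to parse any output to find out what to remove.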
PR: #2076