Changelog
All notable changes to ClaraVerse.
This project uses semantic versioning. Tags are cut as vMAJOR.MINOR.PATCH.
[0.3.0] — 2026-06-01
Project-scoped knowledge bases (RAG) wired through every surface
that touches an LLM. Same retrieval pipeline across Chat, Nexus
daemons, and Workflows — the differentiator that closes the biggest
remaining gap vs. OpenWebUI per the v0.2.0 retrospective.
RAG stack
- Qdrant + FastEmbed sidecars added to docker-compose.yml and
docker-compose.production.yml. Fresh installs now boot the
Knowledge feature end-to-end with no extra setup. Embedding models
are cached in a Docker volume so restarts are fast after the first
~140 MB cold download.
- Default stack: bge-small-en-v1.5 (384-dim cosine, dense) +
Qdrant/bm25 (sparse) + bge-reranker-base (cross-encoder). All
three are swappable per deployment via EMBEDDINGS_* env vars.
- One Qdrant collection per project (kb_<project_oid>) so
per-project snapshots, deletes, and embedder choice are trivial.
- Hybrid retrieval (dense + sparse with RRF fusion) runs by
default for fresh collections. Phase A's pure-dense fallback is
retained for pre-existing collections: Qdrant doesn't support
adding a vector kind retroactively, so older collections stay
dense-only until reingested.
- Reranker on by default: cross-encoder rerank on top-50 → top-K is
measurably better than vector order alone (verified live: the same
query's score progressed 5.24 → 6.77 → 8.45 through dense →
hybrid → hybrid+rerank).
- Background warmup: embeddings sidecar pre-loads dense + sparse
models on boot in daemon threads so the FastAPI port binds
immediately and the first ingest doesn't pay the cold-start cost.
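
A minimal sketch of the RRF fusion step named above, for readers unfamiliar with it. The changelog does not say where fusion runs or which constant is used; the function name rrf_fuse, the chunk IDs, and k=60 are assumptions for illustration only.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_n=None):
    """Fuse ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per chunk, with rank starting at 1.
    k=60 is a commonly used default, assumed here rather than taken from ClaraVerse.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    fused = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return fused if top_n is None else fused[:top_n]


# Hypothetical chunk IDs returned by the dense and sparse searches.
dense_hits = ["c3", "c1", "c7"]
sparse_hits = ["c1", "c9", "c3"]
fused = rrf_fuse([dense_hits, sparse_hits])
print(fused)
```

Chunks that appear in both lists rise to the top because their reciprocal-rank contributions add up, which is why hybrid retrieval can outrank either signal alone.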
Three surfaces, one model
- Chat: multi-project chip picker above the input. Click + Knowledge,
check projects, chips appear. Per-chat selection
persists to localStorage (survives reloads). When the user sends,
search_knowledge is injected as a per-turn tool scoped to the
selected projects. Picker selection is the source of truth: drop
the chips, drop the tool.
- Nexus: daemons spawned on a project task automatically get
search_knowledge when the project has indexed knowledge. The
Cortex classifier reads the project's knowledge state and biases
the daemon's task summary toward "Search the project knowledge
base for X" phrasing. The researcher quality gate accepts
search_knowledge calls as a satisfier (alongside search_web /
fetch_url / read_artifact).
- Workflows: new knowledge_search block in the agent builder.
Full settings panel: project multi-select pulled from REST,
templated query field, top_k (1-30), rerank toggle. Outputs
{chunks, count, elapsed_ms} for downstream LLM blocks.
Frontend additions
- New per-project Knowledge tab under each project in Nexus.
Drag-and-drop upload (PDF / MD / TXT / HTML, up to 50 MB),
file list with live ingest progress (polled every 3 s while
anything is non-terminal), embeddings sidecar warmup banner that
shows only when files are actually queued/ingesting.
- Chat sidebar's previously "Coming Soon" Projects entry is now
active. Click navigates to Nexus for project management; tooltip
surfaces how many projects are currently attached to the chat.
- New stores: useChatKnowledgeStore (per-chat persisted picker
selection), useNexusStore partial persist for live activity
panel survival across navigation.
Data model
- New Mongo collections: nexus_knowledge_files,
nexus_knowledge_collections. Files are catalog only; actual
chunk text + vectors live in Qdrant. Per-collection record
tracks embedder fingerprint so we refuse to ingest a file when
the embedder's dim drifts from what the collection was built with.
- New backend models: KnowledgeFile, KnowledgeCollection,
KnowledgeSearchHit.
- New KnowledgeProjectIDs + InjectedTools fields on
UserConnection for per-turn chat tool wiring.
- NexusOrchestrationState carries ProjectID so a resumed
multi-daemon run re-attaches search_knowledge correctly.
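
The embedder-fingerprint guard described above can be sketched roughly as follows. The real check lives in the Go backend and its shape is not shown in this changelog; EmbedderFingerprint, EmbedderMismatch, check_ingest_allowed, and the second model name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbedderFingerprint:
    """What a collection remembers about the embedder it was built with."""
    model: str
    dim: int


class EmbedderMismatch(Exception):
    """Raised when the active embedder no longer matches the collection."""


def check_ingest_allowed(collection_fp, current_fp):
    # Vectors of a different dimension cannot be stored or compared in the
    # same collection, so ingestion is refused rather than silently corrupted.
    if collection_fp.dim != current_fp.dim:
        raise EmbedderMismatch(
            f"collection built with dim={collection_fp.dim} ({collection_fp.model}), "
            f"active embedder has dim={current_fp.dim} ({current_fp.model})"
        )


built_with = EmbedderFingerprint(model="bge-small-en-v1.5", dim=384)
check_ingest_allowed(built_with, EmbedderFingerprint("bge-small-en-v1.5", 384))  # passes
```

Failing loudly at ingest time keeps the mismatch visible; the alternative is a collection that accepts writes but returns meaningless similarity scores.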
Backend services
- New backend/internal/services/rag/ package: parser
(PDF/MD/TXT/HTML with page + section preservation),
markdown- and code-fence-aware recursive chunker (1000/200),
Qdrant + FastEmbed HTTP clients, ingest worker (drains a Mongo
queue, batches of 64), search orchestrator (multi-project,
hybrid, RRF, dedupe, rerank).
- New RAGSearcher interface in services/ (avoids import cycle
with services/rag). NewRAGSearcher wires it into the
concrete service. Both CortexService and ChatService accept
it via setters.
- daemon_runner.go: per-runner injectedTools map for
context-bound tools that shouldn't go through the global
registry. executeTool dispatches injected tools first.
- tools/registry.go: GetMCPTools now filters by
Source == ToolSourceMCPLocal instead of returning anything in
the userTools bucket; fixes a misroute where injected built-in
tools were getting routed through the MCP bridge.
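
A rough Python illustration of the two dispatch rules above: injected tools win over the global registry, and only tools whose source is the local MCP bridge count as MCP tools. The actual code is Go; every name and the "mcp_local" / "builtin" source strings below are stand-ins, not the project's identifiers.

```python
def execute_tool(name, args, injected_tools, global_registry):
    """Look up a tool handler, preferring per-runner injected tools."""
    handler = injected_tools.get(name)
    if handler is None:
        handler = global_registry.get(name)
    if handler is None:
        raise KeyError(f"unknown tool: {name}")
    return handler(args)


def get_mcp_tools(user_tools):
    """Keep only tools that really come from the local MCP bridge."""
    return [tool for tool in user_tools if tool["source"] == "mcp_local"]


# Hypothetical handlers: one bound to a project for this run, one global.
injected = {"search_knowledge": lambda args: f"kb hit for {args['query']}"}
registry = {"search_web": lambda args: f"web hit for {args['query']}"}

result = execute_tool("search_knowledge", {"query": "RRF"}, injected, registry)

user_tools = [
    {"name": "search_knowledge", "source": "builtin"},
    {"name": "local_fs", "source": "mcp_local"},
]
mcp_only = get_mcp_tools(user_tools)
```

Filtering on the source field rather than on bucket membership is what keeps an injected built-in tool from being sent over the MCP bridge.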
Test harnesses
- scripts/rag_e2e.sh: full Nexus path. Provisions a user,
creates a project, uploads a synthetic doc with a distinctive
marker, waits for ingest, runs REST search, fires a Nexus task
that should auto-use search_knowledge, asserts the marker
appears in the daemon's reply.
- scripts/rag_chat_e2e.py: same shape for the chat WebSocket
path. (Currently blocked by a fiber upgrade quirk in the
websockets Python client; left in for follow-up debug. Chat
wiring is verified by build + boot log + the shared
RAGSearcher interface backing both paths.)
Documentation
- docs/RAG.md: full design doc, including the quality-lever
ranking (hybrid → rerank → contextual chunking → smart
chunking → citations → per-project embedder).
- .env.example: documents the QDRANT_URL, EMBEDDINGS_SERVICE_URL,
EMBEDDINGS_*_MODEL, and EMBEDDINGS_PRELOAD_RERANKER knobs.
- README.md: Knowledge bases row added to the feature matrix;
the single-container docker run install now flags that RAG
needs the Compose setup (no qdrant/embeddings sidecars in the
one-container image).
Deferred to next release
- Contextual chunk prefixing (Anthropic-style, ~35% retrieval
improvement). Toggleable flag preserved on
KnowledgeCollection.ContextualEnabled so this lands without a
schema migration.
- Per-project embedder override in admin UI (backend already
stores EmbedderID per collection; UI not yet built).
- Reingest button on the file list for migrating pre-Phase-C
collections to hybrid.
Fixes for new surfaces (caught during dogfooding)
- Frontend nexusService + knowledgeService list endpoints
coerce JSON null → [] (Go's nil-slice serialization). Saved
every consumer from a defensive ?? [] guard.
- Embeddings sidecar pre-warms models in a daemon thread at boot,
so the "model warming up" banner doesn't get stuck on indefinitely.
- Knowledge tab upload was reading auth_token from localStorage,
but the app stores it as access_token. Switched to
authClient.getAccessToken() as the single source of truth.
- Chat picker checkboxes were unresponsive because the Zustand
selector subscribed to s.get (stable function ref). Switched
to subscribing to s.selections[chatKey] ?? KNOWLEDGE_EMPTY;
toggling now updates the chip row instantly.
[0.2.0] — 2026-05-31
A hardening release focused on making Nexus — the multi-agent orchestration
layer — production-grade. Sixteen fixes across the daemon pipeline, the
LLM transport layer, the Kanban UI lifecycle, and the multi-user safety
boundary. Plus a rebrand revert to the original ClaraVerse name and
rose-pink palette.
Nexus — orchestration durability
- Daemon quality gate. Before a daemon can claim "done," we verify it
actually did its job. Multi-daemon workers with downstream consumers
must have called produce_artifact at least once (without it,
downstream daemons re-derive everything from a thin summary).
Researcher-role daemons must have called at least one
information-gathering tool (search_web, fetch_url, read_artifact,
etc.); finishing without searching means hallucinated output ~95% of
the time. Violations inject a corrective reprompt and continue the
loop, capped at 2 enforcement cycles so a stubborn model can't pin
the loop forever.
- Five knobs for long-running tasks. Bumped the orchestrator execution
timeout from 10 min → 30 min, daemon max iterations from 40 → 100, and
per-user daemon concurrency from 5 → 10. Added exponential backoff
(capped at 30 s) for transient LLM errors: 429s, 5xx, and network blips
no longer kill the run. Added phase-summary recycling: every 25
iterations the daemon compacts its conversation to
[system, original task, phase summary] so 100-iteration runs don't
blow the context window even after aggressive tool-result trimming.
- Fixed the "queued tasks never start" bug. Three independent root
causes were silently swallowing tasks: (1) pending daemons were being
dropped from the queue before slot acquisition succeeded; (2) a slot
exhaustion deadlock had no break-out, fixed with a stuck-pending
detector that marks the daemon failed and exits the loop; (3) zombie
daemons surviving a backend restart were never reaped, so they
occupied phantom slots forever, fixed with boot-time
CleanupStaleDaemons and CleanupStaleTasks.
- Artifact isolation per orchestration. The synthesis step used to
pull every artifact the user had ever produced into the final prompt,
causing stale Python code to leak into edge-computing answers. Now
uses ListSince(parent.CreatedAt) to scope to artifacts produced
within this orchestration's lifetime.
- Synthesis resilience. Added structured error publishing
(error_code, user_message, hint, internal err) so the UI can
show actionable messages instead of raw stack traces. Added 30-min
ceilings on resume-from-mongo contexts.
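
The capped exponential backoff mentioned above might look like this in outline. Only the 30 s cap comes from the changelog; the 1 s base, the retry count, the absence of jitter, and all names (TransientLLMError, backoff_delays, call_with_backoff) are assumptions.

```python
import time


class TransientLLMError(Exception):
    """Stand-in for a retryable failure such as a 429, a 5xx, or a network blip."""


def backoff_delays(max_retries, base=1.0, cap=30.0):
    """Delays double per attempt and never exceed the cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_backoff(fn, max_retries=5, base=1.0, cap=30.0, sleep=time.sleep):
    """Retry fn on transient errors, sleeping between attempts."""
    for delay in backoff_delays(max_retries, base, cap):
        try:
            return fn()
        except TransientLLMError:
            sleep(delay)
    # Final attempt: any error now propagates to the caller.
    return fn()


print(backoff_delays(7))
```

The sleep parameter is injectable so the schedule can be exercised in tests without real waiting.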
Nexus — LLM transport (Bedrock OpenAI shim)
The Bedrock OpenAI-compatible endpoint is strict about request body
shape; four cooperating fixes closed a long-running 400 loop:
- cache_control removal. Bedrock rejects Anthropic-style
cache_control blocks on the OpenAI endpoint.
- HTML-escape disabled on JSON encode (the default Go behavior
rewrites <, >, & to \u003c, \u003e, \u0026, which broke the parser on
certain inputs).
- 20 KB request-body guard with aggressive tool-result shrinking when
a single message body exceeds the threshold.
- Tool call ID sanitization. Bedrock rejects the
functions.X:N-style IDs some providers emit; rewrites them to
call_<n>.
- parallel_tool_calls=false forced: Bedrock 400s on assistant
messages containing multiple tool_calls in a single turn.
- filterValidToolCalls drops tool calls whose arguments JSON was
truncated mid-string by the model; echoing those back triggers
"Unterminated string at column N" 400s.
- Anti-loop nudge appended 5 iterations before the cap, telling the
model to write its final answer instead of burning the budget on more
tool calls.
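
Two of the fixes above, tool call ID sanitization and dropping truncated tool calls, are easy to picture with a small sketch. This is Python rather than the project's Go, the character set treated as "safe" is an assumption (the changelog only says functions.X:N-style IDs are rejected), and the numbering scheme for call_<n> is guessed.

```python
import json
import re

# Assumed safe ID shape; the exact rule Bedrock enforces is not stated in the changelog.
SAFE_ID = re.compile(r"^[A-Za-z0-9_-]+$")


def filter_valid_tool_calls(tool_calls):
    """Drop calls whose arguments are not parseable JSON (e.g. truncated mid-string)."""
    valid = []
    for call in tool_calls:
        try:
            json.loads(call["function"]["arguments"])
        except (json.JSONDecodeError, TypeError, KeyError):
            continue
        valid.append(call)
    return valid


def sanitize_tool_call_ids(tool_calls):
    """Rewrite unsafe IDs to call_<n>; return the new calls and an old->new mapping.

    The mapping is needed so later tool-result messages can reference the new IDs.
    """
    mapping = {}
    cleaned = []
    for call in tool_calls:
        old_id = call["id"]
        new_id = old_id
        if not SAFE_ID.match(old_id):
            new_id = mapping.setdefault(old_id, f"call_{len(mapping)}")
        cleaned.append({**call, "id": new_id})
    return cleaned, mapping


calls = [
    {"id": "functions.search_web:0",
     "function": {"name": "search_web", "arguments": '{"q": "edge"}'}},
    {"id": "functions.fetch_url:1",
     "function": {"name": "fetch_url", "arguments": '{"url": "http'}},  # truncated
]
cleaned, id_map = sanitize_tool_call_ids(filter_valid_tool_calls(calls))
```

Filtering before sanitizing means a dropped call never consumes an ID, so the rewritten IDs stay contiguous.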
Multi-agent — what actually makes the multi-daemon path work
- Artifact handoff between daemons. Wired the artifact store + a
SubagentRunner adapter onto every daemon path, taught the system
prompt to use produce_artifact / list_artifacts / read_artifact
for cross-daemon handoff (without this, downstream daemons only got a
thin summary of upstream output).
- PDF auto-nudge. The context builder detects tasks containing
pdf/report/document and biases the prompt toward an artifact-producing
flow.
- Classifier bias. BuildClassificationPrompt now strongly favors
MULTI_DAEMON for "X and Y"-shaped requests (research X and write Y
was previously misclassified as single-daemon).
- Multi-daemon context. Each daemon now gets a terse
Position: daemon N of M / Handoff: block in its system prompt, so the
model understands it has downstream consumers and uses the artifact tools.
Multi-user safety
- Audited isolation across the shared services. EventBus, task
store, daemon pool queries, artifact store, orchestration state,
WebSocket handler: all correctly scope by userId / sessionId.
- Closed the one gap that existed. DaemonPool.Cancel(daemonID) had
no ownership check; any authenticated user with a daemon ObjectID
could cancel another user's daemon. Added
CancelForUser(ctx, userID, daemonID) that loads the daemon scoped to
the user before cancelling; routed the WebSocket cancel_daemon
handler through it.
- Concurrent stress test (scripts/nexus_multiuser_stress.sh):
provisions N test users, fires N tasks in parallel with unique marker
strings, asserts each user's result contains its OWN marker and none
of the others'. Validates the entire shared-service isolation under
load. 5/5 clean, ~36 s end-to-end concurrent run, zero cross-user
contamination.
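
The ownership check behind CancelForUser can be illustrated with an in-memory stand-in. The real implementation is Go and loads the daemon from storage scoped to the user; the DaemonPool class, its dictionary store, and DaemonNotFound below are hypothetical.

```python
class DaemonNotFound(Exception):
    """Raised for unknown daemons and for daemons owned by someone else."""


class DaemonPool:
    def __init__(self):
        # daemon_id -> {"user_id": str, "cancelled": bool}
        self._daemons = {}

    def add(self, daemon_id, user_id):
        self._daemons[daemon_id] = {"user_id": user_id, "cancelled": False}

    def is_cancelled(self, daemon_id):
        return self._daemons[daemon_id]["cancelled"]

    def cancel_for_user(self, user_id, daemon_id):
        # Look the daemon up scoped to the caller. A daemon that belongs to
        # another user is reported exactly like a missing one, so the caller
        # learns nothing about other users' daemon IDs.
        daemon = self._daemons.get(daemon_id)
        if daemon is None or daemon["user_id"] != user_id:
            raise DaemonNotFound(daemon_id)
        daemon["cancelled"] = True


pool = DaemonPool()
pool.add("d1", user_id="alice")
pool.cancel_for_user("alice", "d1")
```

Returning the same error for "not yours" and "does not exist" is a design choice of this sketch, not something the changelog specifies.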
Frontend — Nexus UI lifecycle
- Live activity panel survives navigation. useNexusStore now wraps
persist with sessionStorage, partialized to conversation (last
200), daemons, classification, missedUpdates. Switching from
Chat → Nexus → Workflows → Nexus keeps the panel populated; hard
reload within the tab session also restores it.
- REST rehydration on mount. Nexus.tsx fetches active daemons from
GET /api/nexus/daemons on mount so even a cold reload (where
sessionStorage was empty) repopulates the panel headers instantly,
while the WebSocket streams live updates on top.
- Task vanishing fix. Tasks now rehydrate from REST on connect (not
just from the WS session_state message, which can lag if events
arrived while the WS was disconnected). Three trigger paths:
connect/projectId change, document visibilitychange, and mount.
- Daemon panel. expandedColumns includes 'done' by default;
extractFileLinks regex matches (api/)?files/<id> and renders green
download buttons in TaskDetailPanel.
- Toast on completion. task_completed events now surface a success
toast with a truncateForToast helper.
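
To make the (api/)?files/<id> pattern above concrete, here is a loose Python rendering of what such a matcher could do. The real extractFileLinks is frontend TypeScript; the character class used for <id> and the de-duplication are assumptions.

```python
import re

# Optional "api/" prefix, then "files/", then an ID. The ID alphabet is assumed.
FILE_LINK = re.compile(r"(?:api/)?files/([A-Za-z0-9_-]+)")


def extract_file_links(text):
    """Return file IDs in order of first appearance, without duplicates."""
    seen = []
    for match in FILE_LINK.finditer(text):
        file_id = match.group(1)
        if file_id not in seen:
            seen.append(file_id)
    return seen


reply = "Report saved to /api/files/abc123, raw data at files/def456 (see api/files/abc123)."
ids = extract_file_links(reply)
```

Each returned ID would correspond to one download button in the task detail view.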
Backend — execution + headless API
- /api/nexus/run and /api/nexus/run/sync endpoints. Headless REST
for firing a Nexus task: fire-and-poll vs. block-until-completion.
Both honor a 30-min ceiling.
- /api/workflows/templates endpoint. Workflow template gallery
backend with 5 built-in templates.
- Backup/restore admin endpoints.
- Integration test suite for nexus_orchestration_state and
nexus_artifact_store. Caught a bson tag bug on
NexusArtifactSummary (size_bytes / created_at were returning 0).
- CosineSimilarity precision fix. The homemade sqrt with 4 Newton
iterations was imprecise; replaced with math.Sqrt (caught by unit
test).
- sendUpdate panic guard. Was crashing the whole backend on a
send on closed channel; now uses defer recover().
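
The CosineSimilarity fix is easier to appreciate with numbers. The sketch below contrasts a fixed four-step Newton square root with the library routine; the starting guess (x itself) is an assumption, since the original Go helper is not shown in this changelog, and Python's math.sqrt stands in for Go's math.Sqrt.

```python
import math


def newton_sqrt(x, iterations=4):
    """Square root via a fixed number of Newton steps, starting from x (assumed)."""
    if x == 0:
        return 0.0
    guess = x
    for _ in range(iterations):
        guess = 0.5 * (guess + x / guess)
    return guess


def cosine_similarity(a, b):
    """Cosine similarity using the library square root."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen for this sketch
    return dot / (norm_a * norm_b)


# With large inputs, four Newton steps from a poor starting guess land far from the answer.
print(newton_sqrt(10000.0), math.sqrt(10000.0))
print(cosine_similarity([3.0, 4.0], [3.0, 4.0]))
```

A fixed iteration count only converges well when the starting guess is already close, which is why accuracy degrades as the squared norm grows.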
Branding
- Rose pink restored. Design tokens reverted to the original
ClaraVerse palette: #e91e63 accent with hover #f06292 / active
#c2185b, HSL 340 82% 52% for shadcn primary / ring, matching
glow shadows and gradients. The "Emerald Edition" was an experimental
detour.
- Name restored. 70 files reverted from DobbyAI → ClaraVerse
across user-facing copy, meta tags, comments, log lines, and config
defaults. URLs: dobbyai.app → claraverse.app. The DOBBY_DF /
<</DOBBY_DF>> protocol tokens between the Go runtime and the Python
runner were deliberately preserved: those are a wire-format
sentinel, not a brand string, and renaming them would silently break
DataFrame extraction.
Scripts
- scripts/nexus_e2e.sh: single-task end-to-end tester against
/api/nexus/run/sync.
- scripts/nexus_multiuser_stress.sh: concurrent multi-user
isolation/contamination test.
- backend/cmd/mint_token: mints a JWT for the first mongo user
(used by both scripts).
Documentation
- docs/SYSTEMS.md: ~600-line walkthrough of the Nexus pipeline
(Cortex classifier → DaemonPlan DAG → DaemonRunner goroutines →
synthesis). Reference for anyone debugging the multi-agent path.
For prior history (v0.1.x and earlier), see
GitHub Releases.