github topoteretes/cognee v1.2.2
v1.2.2 — Truth Subspace & Retrieval Improvements


Release Date: 2026-06-26
Changes: v1.2.2.dev0 → main


Summary

This release introduces a new "truth subspace": a compact index built from distilled, accepted session learnings that helps rerank search results and weight feedback. It also activates an opt-in learned feedback signal for retrieval, fixes LanceDB S3 issues, and adds demos and tests to showcase the new reranking workflow.

Highlights

  • New truth subspace: builds centroids and slots from distilled session lessons to rerank search results and align them with what the system has "learned".
  • Truth-subspace reranking + feedback activation: retrieval can now use a learned feedback signal to improve result relevance (opt-in by configuration).
  • Improvements to integrity and filtering: signatures now use sha256 and centroid session filtering has been tightened to reduce noise in truth anchors.
  • Operational changes: new opt-in build flag on the improve API, dataset context tracking, and a fix for LanceDB when used with S3.
  • Examples and tests: demos and unit tests added to help you try and validate the new truth-subspace workflow.

Breaking Changes

  • None. All new behavior is opt-in. The default feedback influence remains 0.0 unless you set DEFAULT_FEEDBACK_INFLUENCE or pass a non-zero feedback_influence in API calls.
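
The precedence described above (per-call value, then environment variable, then 0.0) can be sketched as follows. This is a minimal illustration, not cognee's code: the helper name resolve_feedback_influence is hypothetical, and clamping to the range [0, 1] is an assumption based on the value being described as a blend weight.

```python
import os


def resolve_feedback_influence(per_call=None, env=None):
    """Hypothetical helper: a per-call value wins, then the env var, then 0.0."""
    env = os.environ if env is None else env
    if per_call is not None:
        value = float(per_call)
    else:
        # DEFAULT_FEEDBACK_INFLUENCE is the variable named in the release notes.
        value = float(env.get("DEFAULT_FEEDBACK_INFLUENCE", "0.0"))
    # Assumption: the weight is a blend factor, so keep it within [0, 1].
    return min(max(value, 0.0), 1.0)


if __name__ == "__main__":
    print(resolve_feedback_influence(env={}))
    print(resolve_feedback_influence(env={"DEFAULT_FEEDBACK_INFLUENCE": "0.3"}))
    print(resolve_feedback_influence(0.5, env={"DEFAULT_FEEDBACK_INFLUENCE": "0.3"}))
```

With nothing configured the helper returns 0.0, which matches the statement that existing deployments see no change in ranking.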

New Features

  • Truth subspace builder: A new module that compiles distilled session learnings (lessons you accepted during conversations) into a compact index made of centroids (topic anchors) and slots (grouped examples). What it does: creates a small, structured representation of "what the system has learned" from sessions. Why it matters: using these centroids and slots to rerank search results makes answers align better with previously confirmed information.
  • Centroid-slot truth weighting (MVP): A simple weighting scheme that boosts or penalizes search candidates based on how well they align with truth centroids and their example slots. What it does: adjusts ranking scores using the truth subspace. Why it matters: improves relevance by prioritizing answers consistent with prior accepted lessons.
  • Truth-subspace reranking + feedback activation: Retrieval now supports a reranking stage that uses the truth subspace and an activated learned feedback signal (a numeric weight that represents how strongly session lessons should influence search). What it does: reorders search results to favor items aligned with your distilled truths; the feedback signal can be tuned. Why it matters: yields more context-aware and consistent answers across sessions.
  • Build-truth-subspace option in the Improve API: The /improve endpoint gained an opt-in flag (build_truth_subspace) that runs the truth subspace build after session distillation and before enrichment. What it does: lets you create the truth subspace as part of your existing enrichment pipeline. Why it matters: convenient integration — you can update the truth index automatically when improving a dataset.
  • New demos and examples: Added interactive demos and example scripts (examples/demos/truth_centroid_slots_demo.py and examples/python/truth_subspace_reranking_demo.py) showing how to build and use the truth subspace. What it does: provides runnable code to try the reranking workflow. Why it matters: makes it easier to experiment and verify behavior in your data.
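
To make the centroid-based reranking idea concrete, here is a small self-contained sketch. It does not use cognee's truth_subspace package, and the release notes do not state the scoring formula: the cosine similarity, the "best-matching centroid" rule, and the linear blend with feedback_influence are all assumptions chosen for illustration, and slots are not modeled. The function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of lesson embeddings."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def rerank(candidates, centroids, feedback_influence=0.0):
    """Blend each candidate's base score with its best centroid alignment.

    candidates: list of (candidate_id, base_score, embedding) tuples.
    Assumed formula: (1 - w) * base_score + w * max cosine to any centroid.
    """
    scored = []
    for candidate_id, base_score, embedding in candidates:
        alignment = max((cosine(embedding, c) for c in centroids), default=0.0)
        final = (1.0 - feedback_influence) * base_score + feedback_influence * alignment
        scored.append((candidate_id, final))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    truth = centroid([[1.0, 0.0], [0.9, 0.1]])  # two accepted lessons
    candidates = [("a", 0.6, [0.0, 1.0]), ("b", 0.5, [1.0, 0.0])]
    print(rerank(candidates, [truth], feedback_influence=0.0))
    print(rerank(candidates, [truth], feedback_influence=0.5))
```

With an influence of 0.0 the original order is preserved; raising it lets the candidate that aligns with the accepted lessons overtake one with a slightly higher base score.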

Improvements

  • Configurable default feedback influence: Introduced a new base configuration setting (DEFAULT_FEEDBACK_INFLUENCE env var) so the learned feedback signal can be enabled or tuned without changing code. What it does: controls the default blend weight applied to the learned feedback during graph search. Why it matters: you can opt into or adjust feedback influence globally or per-call for smoother rollouts.
  • Safer truth signatures: Truth entries now use sha256 hashing for signatures. What it does: improves robustness and consistency of truth identity. Why it matters: reduces the chance of signature collisions and makes truth verification more reliable.
  • Tighter centroid session filtering: The process that selects which sessions contribute to truth centroids now filters more strictly. What it does: reduces noisy or irrelevant session contributions to centroids. Why it matters: yields cleaner, more reliable centroids so reranking reflects meaningful lessons.
  • Dataset context tracking: Added a request-local current_dataset_id context variable so background tasks and retrieval routines know which dataset is in scope. What it does: provides per-request dataset awareness through the call stack. Why it matters: prevents accidental cross-dataset mixing and makes downstream code simpler to write.
  • Improved README and plugin instructions: Claude Code plugin documentation updated to clarify installation and lifecycle behavior (what is captured and when session memory syncs to the graph).
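
The dataset context tracking item relies on a request-local variable. A minimal sketch of that pattern with Python's standard contextvars module is shown below. Only the variable name current_dataset_id comes from the release notes; the DatasetScope class and the handler functions are hypothetical stand-ins, not cognee's DatabaseContextManager.

```python
import asyncio
import contextvars

# Request-local variable; the default of None means "no dataset in scope".
current_dataset_id = contextvars.ContextVar("current_dataset_id", default=None)


class DatasetScope:
    """Hypothetical async context manager that sets and later resets the variable."""

    def __init__(self, dataset_id):
        self.dataset_id = dataset_id
        self._token = None

    async def __aenter__(self):
        self._token = current_dataset_id.set(self.dataset_id)
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Restore whatever value was in place before this scope was entered.
        current_dataset_id.reset(self._token)


async def retrieve():
    # Downstream code reads the dataset without it being passed as an argument.
    return current_dataset_id.get()


async def handle_request(dataset_id):
    async with DatasetScope(dataset_id):
        await asyncio.sleep(0)  # yield so the two requests interleave
        return await retrieve()


async def main():
    # Each coroutine runs in its own task, and each task has its own context copy.
    return await asyncio.gather(handle_request("ds-a"), handle_request("ds-b"))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because asyncio tasks each run in a copy of the context, concurrent requests see only their own dataset id, which is the property that prevents cross-dataset mixing.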

Performance

  • Fast rerank step: Precomputed centroids and slots let retrieval apply a cheap rerank pass instead of an expensive re-evaluation of every candidate, so results come back quicker and more focused.
  • Unit-tested truth_subspace components: A large test suite was added around the new truth_subspace modules, improving confidence and guarding against regressions that could affect search quality and throughput.

Security

  • Truth signature hashing upgraded to sha256: This strengthens the integrity of truth identifiers and makes signature collisions far less likely. This is an internal hardening to ensure stored lesson identities are consistent and tamper-resistant.
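
As an illustration of a sha256-based signature, the sketch below hashes a lesson's text with Python's standard hashlib. The function name truth_signature and the whitespace and case normalization are assumptions; the release notes do not say which fields cognee hashes or how they are normalized.

```python
import hashlib


def truth_signature(lesson_text):
    """Hypothetical signature: sha256 hex digest of the normalized lesson text."""
    # Assumption: collapse whitespace and lowercase so trivial edits hash the same.
    normalized = " ".join(lesson_text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(truth_signature("Invoices are due in 30 days."))
```

The digest is always 64 hexadecimal characters, and any change to the normalized text produces a different signature.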

Bug Fixes

  • Fix: LanceDB S3 usage resolved. What it fixed: an issue that prevented using LanceDB vector storage backed by S3. Why it matters: users storing vectors in S3-backed LanceDB instances can now use that setup reliably.
  • Various doc fixes and small API hygiene updates (readme clarity, test additions).

Technical Changes

  • New truth_subspace package: many new modules (align.py, build.py, centroids.py, models.py, constants.py) and accompanying unit tests implementing centroid and slot building, alignment, and a simple truth weighting/reranking flow.
  • API surface updates: /improve gained a build_truth_subspace boolean parameter; search and recall endpoints now read default feedback influence from base config instead of a hard-coded 0.0.
  • Graph DB interface extension: new abstract methods get_node_truth_state and set_node_truth_state added to the graph DB interface so adapters can persist per-node truth alignment state.
  • Database adapters and integrations: Ladybug graph adapter added/updated; LanceDB adapter fixed for S3; vector/graph interface updates to support truth state and retrieval changes.
  • Context manager changes: DatabaseContextManager now sets and resets a per-request dataset token (current_dataset_id) so dataset identity is propagated during async operations.
  • Tests and examples: Many new unit tests for truth_subspace and retrieval integration and example scripts/demos to exercise the new features.
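
For adapter authors, the graph DB interface extension might look roughly like the sketch below. Only the method names get_node_truth_state and set_node_truth_state come from the release notes. The class names, the async signatures, the argument names, and the dict-shaped state are assumptions made for illustration and should be checked against the actual interface.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class TruthStateInterface(ABC):
    """Hypothetical slice of a graph DB interface covering the two new methods."""

    @abstractmethod
    async def get_node_truth_state(self, node_id: str) -> Optional[Dict[str, Any]]:
        """Return the stored truth state for a node, or None if there is none."""

    @abstractmethod
    async def set_node_truth_state(self, node_id: str, state: Dict[str, Any]) -> None:
        """Persist the truth state for a node."""


class InMemoryAdapter(TruthStateInterface):
    """Toy adapter that keeps truth state in a dict instead of a graph database."""

    def __init__(self) -> None:
        self._states: Dict[str, Dict[str, Any]] = {}

    async def get_node_truth_state(self, node_id: str) -> Optional[Dict[str, Any]]:
        return self._states.get(node_id)

    async def set_node_truth_state(self, node_id: str, state: Dict[str, Any]) -> None:
        self._states[node_id] = dict(state)  # copy so callers cannot mutate storage


async def demo():
    adapter = InMemoryAdapter()
    await adapter.set_node_truth_state("node-1", {"alignment": 0.8})
    found = await adapter.get_node_truth_state("node-1")
    missing = await adapter.get_node_truth_state("node-2")
    return found, missing


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the methods are abstract, a custom adapter that does not implement both of them cannot be instantiated, so third-party adapters are worth checking when upgrading.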

Compatibility

Component     Supported / Required
Python        >=3.10,<3.15
pydantic      >=2.10.5
litellm       >=1.83.7
fastapi       >=0.116.2,<1.0.0
sqlalchemy    >=2.0.39,<3.0.0
lancedb       >=0.24.3,<1.0.0
ladybug       >=0.16.0,<0.18

— The Cognee Team · 2026-06-26
