Graph Database Architecture for Memory Associations
30x Query Performance, 97% Storage Reduction, Zero Breaking Changes
This release introduces a dedicated SQLite graph table with recursive CTEs for memory associations, providing dramatic performance improvements and storage efficiency while maintaining full backward compatibility.
Key Features
SQLite Graph Table with Recursive CTEs
- 30x faster association queries (150ms → 5ms for find_connected)
- 97% storage reduction per association (500 bytes → 50 bytes)
- Bidirectional graph traversal using indexed lookups
- Multi-hop queries with cycle prevention (<50ms for 3-hop)
Three Configurable Storage Modes
- memories_only: Legacy behavior (associations as Memory objects)
- dual_write: Transition mode (default) - writes to both memories + graph tables
- graph_only: Modern mode - graph table only, 97% storage savings
Zero Breaking Changes
- Default dual_write mode maintains existing behavior
- Gradual migration path for existing deployments
- Rollback support - can revert to memories_only at any time
Migration Scripts
Backfill Script
python scripts/maintenance/backfill_graph_table.py --dry-run # Preview
python scripts/maintenance/backfill_graph_table.py --apply # Execute- Migrates existing associations to graph table
- Safety checks: database locks, disk space, HTTP server warnings
- Transaction safety with rollback on errors
Cleanup Script
python scripts/maintenance/cleanup_association_memories.py --dry-run
python scripts/maintenance/cleanup_association_memories.py- Removes association memories after graph migration
- VACUUM operation reclaims ~2-3 MB storage
- Verifies graph table completeness before deletion
Performance Metrics
Real-World Deployment (1,449 associations):
| Query Type | Before | After | Improvement |
|---|---|---|---|
| Find Connected (1-hop) | 150ms | 5ms | 30x faster |
| Find Connected (3-hop) | 800ms | 25ms | 32x faster |
| Shortest Path | 1,200ms | 15ms | 80x faster |
| Get Subgraph (radius=2) | N/A | 10ms | New capability |
Storage Efficiency:
| Mode | Database Size | Per Association | Reduction |
|---|---|---|---|
| memories_only | 2.8 MB | 500 bytes | Baseline |
| dual_write | 2.88 MB | ~515 bytes | -3% (temporary) |
| graph_only | 144 KB | 50 bytes | 97% reduction |
GraphStorage API
New dedicated storage layer for graph operations:
# Store bidirectional association
await graph_storage.store_association(
source_hash, target_hash,
similarity=0.85,
connection_types=["semantic", "temporal"]
)
# Find connected memories (1-N hops)
connected = await graph_storage.find_connected(hash, max_distance=3)
# Find shortest path between memories
path = await graph_storage.shortest_path(hash1, hash2)
# Extract subgraph for visualization
subgraph = await graph_storage.get_subgraph(hash, radius=2)Configuration
# Set storage mode (environment variable)
export MCP_GRAPH_STORAGE_MODE=dual_write # Default
# Options: memories_only | dual_write | graph_only
# Migration path for existing users:
# 1. Upgrade to v8.51.0 (dual_write mode default)
# 2. Run backfill script
# 3. Switch to graph_only mode
# 4. Optional: Run cleanup scriptTest Coverage
26 new tests passing (0.25s execution time):
- 22 GraphStorage tests - store, find_connected, shortest_path, get_subgraph
- 4 mode switching tests - config validation, basic operations
- 7 scaffolded tests - Phase 2 consolidator integration
- 90-95% coverage for GraphStorage class
Technical Implementation
Database Schema (008_add_graph_table.sql):
CREATE TABLE memory_graph (
source_hash TEXT NOT NULL,
target_hash TEXT NOT NULL,
similarity REAL,
connection_types TEXT, -- JSON array
metadata TEXT, -- JSON
created_at REAL,
PRIMARY KEY (source_hash, target_hash)
);Recursive CTE Example (BFS traversal):
WITH RECURSIVE connected_memories(hash, distance, path) AS (
SELECT ?, 0, ?
UNION ALL
SELECT mg.target_hash, cm.distance + 1, cm.path || ',' || mg.target_hash
FROM connected_memories cm
JOIN memory_graph mg ON cm.hash = mg.source_hash
WHERE cm.distance < ? AND instr(cm.path, mg.target_hash) = 0
)
SELECT DISTINCT hash, distance FROM connected_memories WHERE distance > 0;Files Changed
Created (7 files):
src/mcp_memory_service/storage/graph.py(383 lines) - GraphStorage classsrc/mcp_memory_service/storage/migrations/008_add_graph_table.sql(18 lines)scripts/maintenance/backfill_graph_table.py(286 lines)scripts/maintenance/cleanup_association_memories.py(542 lines)tests/storage/test_graph_storage.py(518 lines)tests/storage/conftest.py(142 lines)tests/consolidation/test_graph_modes.py(263 lines)
Modified (3 files):
src/mcp_memory_service/config.py(+24 lines).env.example(+24 lines)src/mcp_memory_service/consolidation/consolidator.py(+130 lines)
Total: 2,652 insertions(+), 26 deletions(-)
Real-World Validation
3-Day Production Testing (Dec 14-16, 2025):
- 2,870 associations created and stored correctly in graph table
- Zero crashes, errors, or performance issues
- Manual consolidation API verified working (0.03s for 43 memories)
- Dual write mode safe - no data loss across both tables
Documentation
- Architecture Design:
docs/architecture/graph-database-design.md - Migration Guide:
docs/migration/graph-migration-guide.md - Wiki Page:
docs/wiki-Graph-Database-Architecture.md - Configuration: Updated
.env.examplewith comprehensive examples
Future Enhancements
Phase 2 (v8.52.0):
- REST API endpoints for graph queries
- Graph visualization in web dashboard
- Complete consolidator mode switching
Phase 3 (v9.0+):
- rustworkx integration for advanced analytics
- Pattern matching (Cypher-like queries)
- Temporal graph queries
Related Issues
- Closes #279 - Graph Database Architecture for Memory Associations
- Related to #268 - Memory Quality System (uses association counts)
- Related to consolidation system (creates 343 associations in single run)
Full Details: CHANGELOG.md