🧠 Memory Hook Quality Improvements
This release addresses three interconnected issues to improve memory capture quality, eliminate duplicates, and optimize memory budget allocation.
Issues Resolved: #390, #391, #392
🎯 Key Features
1. Semantic Deduplication
Prevents storing semantically similar content within a configurable time window:
- KNN Cosine Similarity: Uses sqlite-vec's optimized vector search (85% threshold)
- 24-Hour Window: Catches cross-hook duplicates (PostToolUse + SessionEnd reformulations)
- Configurable: Three environment variables for fine-tuning
- Fast: <100ms overhead per storage operation
- Smart Errors: Returns a descriptive message that includes the hash of the similar memory
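The check described above can be sketched roughly as follows. This is an illustration only, not the service's implementation: the release performs the search with sqlite-vec's KNN query, whereas this sketch uses a plain in-memory scan. The names `DedupConfig`, `load_config`, and `find_semantic_duplicate`, and the `(content_hash, embedding, created_at)` tuple layout, are hypothetical. Only the three environment variable names and their defaults come from the release notes.

```python
import math
import os
import time
from dataclasses import dataclass


@dataclass
class DedupConfig:
    enabled: bool
    window_hours: float
    threshold: float


def load_config(env=os.environ) -> DedupConfig:
    """Read the three documented variables, falling back to the documented defaults."""
    return DedupConfig(
        enabled=env.get("MCP_SEMANTIC_DEDUP_ENABLED", "true").strip().lower() == "true",
        window_hours=float(env.get("MCP_SEMANTIC_DEDUP_TIME_WINDOW_HOURS", "24")),
        threshold=float(env.get("MCP_SEMANTIC_DEDUP_THRESHOLD", "0.85")),
    )


def cosine_similarity(a, b) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_semantic_duplicate(embedding, stored, config, now=None):
    """Return the hash of the most similar recent memory, or None.

    `stored` is a hypothetical list of (content_hash, embedding, created_at)
    tuples standing in for the vector index; created_at is a Unix timestamp.
    """
    if not config.enabled:
        return None
    now = time.time() if now is None else now
    cutoff = now - config.window_hours * 3600
    best_hash, best_score = None, config.threshold
    for content_hash, other, created_at in stored:
        if created_at < cutoff:
            continue  # outside the time window, never treated as a duplicate
        score = cosine_similarity(embedding, other)
        if score >= best_score:
            best_hash, best_score = content_hash, score
    return best_hash
```

A caller would reject the write and report the returned hash when the result is not `None`, which matches the "similar memory hash" error described above.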
Configuration:
MCP_SEMANTIC_DEDUP_ENABLED=true # Enable/disable (default: true)
MCP_SEMANTIC_DEDUP_TIME_WINDOW_HOURS=24 # Time window (default: 24)
MCP_SEMANTIC_DEDUP_THRESHOLD=0.85 # Similarity threshold (default: 0.85)
2. Tag Case-Normalization
All tags stored lowercase with case-insensitive deduplication:
- Eliminates Duplicates: ["Tag", "tag", "TAG"] → ["tag"]
- Applied Everywhere: Parameter tags, metadata tags, hook-generated tags
- Backward Compatible: Existing mixed-case tags unchanged, searches already case-insensitive
- Consistent: JavaScript hooks updated for uniform case-normalization
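As a rough illustration of the normalization rule, the sketch below lowercases tags from several sources and removes case-insensitive duplicates. The function name `normalize_tags` is hypothetical, and stripping surrounding whitespace and preserving first-seen order are assumptions that the release notes do not state.

```python
def normalize_tags(*tag_sources):
    """Lowercase and de-duplicate tags from any number of sources.

    Each source is a list of strings (or None), for example parameter tags,
    metadata tags, and hook-generated tags. First-seen order is preserved.
    """
    seen = set()
    result = []
    for source in tag_sources:
        for tag in source or []:
            normalized = tag.strip().lower()  # whitespace stripping is an assumption
            if normalized and normalized not in seen:
                seen.add(normalized)
                result.append(normalized)
    return result


print(normalize_tags(["Tag", "tag", "TAG"]))  # ['tag']
```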
3. Memory Budget Optimization
Increased from 8 to 14 slots with reserved minimums:
- 75% Capacity Increase: 8 → 14 slots for more comprehensive context
- Reserved Tag Slots: Minimum 3 slots guaranteed for tag-based retrieval
- Smart Allocation:
- Phase 0 (Git): Up to 3 slots (adaptive)
- Phase 1 (Recent): ~60% of remaining slots
- Phase 2 (Tags): At least 3 slots, more if available
- Prevents Crowding: Semantic search can't dominate curated memories
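The phased allocation can be pictured with the sketch below. It is a guess at the arithmetic rather than the hook's actual code: the function name, the rounding-down of the 60% share, and the rule that unused Phase 1 slots flow to Phase 2 are all assumptions. Only the totals (14 slots, up to 3 for git, about 60% for recent, at least 3 for tags) come from the notes.

```python
def allocate_slots(git_available, recent_available, tag_available,
                   total=14, git_cap=3, recent_share=0.6, tag_min=3):
    """Split the memory budget across the three phases.

    The *_available arguments are how many candidate memories each phase found.
    """
    # Phase 0: git context takes what it needs, up to its cap.
    git = min(git_available, git_cap)
    remaining = total - git

    # Phase 1: recent memories get about 60% of what is left (rounded down here),
    # but never so much that fewer than tag_min slots remain for tags.
    recent_target = min(int(remaining * recent_share), remaining - tag_min)
    recent = min(recent_available, recent_target)

    # Phase 2: tags get everything still unallocated, which is at least tag_min.
    tags = min(tag_available, remaining - recent)
    return {"git": git, "recent": recent, "tags": tags}


print(allocate_slots(git_available=5, recent_available=20, tag_available=20))
# {'git': 3, 'recent': 6, 'tags': 5}
```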
4. Enhanced Content Truncation
Multi-delimiter support for natural sentence boundaries:
- 9-10 Delimiter Types: .!?.\n!\n?\n.\t;\n\n\n
- 70% Threshold: More flexible than previous 80% (better natural breaks)
- Consistent Application: Applied across auto-capture-patterns.js and context-formatter.js
- No Mid-Sentence Cuts: Eliminates awkward breaks at colons/commas
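A minimal Python sketch of boundary-aware truncation follows. The real logic lives in the JavaScript hooks, so this is only an approximation of the idea. The `DELIMITERS` list is an illustrative guess at the delimiter set, and the function name and hard-cut fallback are assumptions. The 70% threshold is from the notes: a break is accepted only if it keeps at least 70% of the allowed length.

```python
# Illustrative delimiter set (an assumption, not the hooks' exact list).
DELIMITERS = [". ", "! ", "? ", ".\n", "!\n", "?\n", ".\t", "; ", "\n\n"]


def truncate_at_boundary(text, max_length, min_ratio=0.7):
    """Cut text at the last sentence-like boundary within max_length.

    If no boundary falls at or beyond min_ratio * max_length, fall back to a
    hard cut at max_length (the fallback behaviour is an assumption).
    """
    if len(text) <= max_length:
        return text
    window = text[:max_length]
    best_end = -1
    for delimiter in DELIMITERS:
        index = window.rfind(delimiter)
        if index != -1:
            best_end = max(best_end, index + len(delimiter))
    if best_end >= max_length * min_ratio:
        # Keep the punctuation, drop the whitespace that followed it.
        return window[:best_end].rstrip()
    return window.rstrip()


sample = "First sentence is here. Second sentence runs on for quite a while without ending"
print(truncate_at_boundary(sample, 30))  # First sentence is here.
```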
📊 Testing
17 New Tests Added:
- 6 semantic deduplication tests (time windows, config, edge cases)
- 11 tag normalization tests (unit + integration scenarios)
- All tests passing: 99/99 ✅
- Zero regressions
Test Coverage:
- Semantic dedup: Time windows, threshold validation, disabled mode, similar/different content
- Tag normalization: Parameter tags, metadata tags, hook tags, mixed sources, case variations
- Test environment: Semantic dedup disabled during tests for isolation
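One way to get this kind of isolation is to override the environment variable for the duration of a test, as in the sketch below. How the project itself disables deduplication under pytest is not described in these notes, so the helper name `semantic_dedup_disabled` and the approach are assumptions. The sketch also assumes the setting is read at call time rather than once at import.

```python
import os
from contextlib import contextmanager
from unittest import mock


@contextmanager
def semantic_dedup_disabled():
    """Temporarily set MCP_SEMANTIC_DEDUP_ENABLED=false, restoring it on exit."""
    with mock.patch.dict(os.environ, {"MCP_SEMANTIC_DEDUP_ENABLED": "false"}):
        yield


# Usage inside a test body:
with semantic_dedup_disabled():
    assert os.environ["MCP_SEMANTIC_DEDUP_ENABLED"] == "false"
```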
🔧 Improvements
Hook Deduplication:
- Lowered Jaccard threshold from 80% to 65% in context-formatter.js
- Catches more cross-hook reformulations (55-70% similarity range)
- Better balance between duplicate detection and legitimate variations
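To show what the lower threshold changes, here is a small Python sketch of word-level Jaccard similarity. The actual check is in context-formatter.js, and its tokenization is not described in these notes, so the lowercase word-set tokenizer and the function names here are assumptions.

```python
import re


def jaccard_similarity(a, b):
    """Size of the shared word set divided by the size of the combined word set."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if not words_a and not words_b:
        return 1.0  # two empty texts are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


def is_duplicate(a, b, threshold=0.65):
    """True when the two texts overlap at or above the threshold."""
    return jaccard_similarity(a, b) >= threshold


first = "a b c d e f g"
second = "a b c d e f g h i j"   # 7 shared words out of 10 total, so 0.7
print(is_duplicate(first, second, threshold=0.80))  # False under the old threshold
print(is_duplicate(first, second, threshold=0.65))  # True under the new threshold
```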
Documentation:
- Comprehensive implementation plan in fix-plan-issues-390-391-392.md
- Updated .env.example with semantic deduplication configuration
- Created TEST_ADDITIONS_SUMMARY.md documenting all test scenarios
- Session-start.js configuration documentation
⚡ Performance
- Semantic Dedup: <100ms overhead per storage operation
- KNN Search: Leverages sqlite-vec's optimized cosine distance calculations
- No Retrieval Impact: Hook execution time unchanged (<10s total)
- Memory Efficient: Vector operations use indexed search
🔄 Backward Compatibility
- Zero Breaking Changes: All changes are backward compatible
- Existing Tags: Mixed-case tags remain unchanged in database
- Searches: Already case-insensitive, no behavior change
- Semantic Dedup: Can be disabled via MCP_SEMANTIC_DEDUP_ENABLED=false
- Test Isolation: Semantic dedup automatically disabled during pytest runs
📝 Migration Notes
No migration required! This release is 100% backward compatible.
Optional Configuration:
If you want to adjust semantic deduplication behavior, add to your .env:
# Default values (can be customized)
MCP_SEMANTIC_DEDUP_ENABLED=true
MCP_SEMANTIC_DEDUP_TIME_WINDOW_HOURS=24
MCP_SEMANTIC_DEDUP_THRESHOLD=0.85
🚀 Installation
PyPI (recommended):
pip install --upgrade mcp-memory-service
From source:
git clone https://github.com/doobidoo/mcp-memory-service.git
cd mcp-memory-service
git checkout v10.4.0
pip install -e .
Docker:
docker pull ghcr.io/doobidoo/mcp-memory-service:v10.4.0
🙏 Contributors
Special thanks to everyone who reported issues, provided feedback, and contributed to this release!
Full Changelog: v10.3.0...v10.4.0