Backend-Specific Content Length Limits with Intelligent Auto-Splitting
Major Feature: Intelligent Content Length Management
Prevents embedding model failures by enforcing backend-specific limits and automatically splitting large content with boundary preservation.
Key Features
Backend-Aware Content Limits
- Cloudflare: 800 characters (BGE-base-en-v1.5 model 512 token limit)
- ChromaDB: 1500 characters (all-MiniLM-L6-v2 model 384 token limit)
- SQLite-vec: Unlimited (local storage)
- Hybrid: 800 characters (constrained by Cloudflare secondary storage)
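As an illustration only, the limits listed above could be expressed as a lookup table with a simple length check. The dictionary name, the backend keys, and the `exceeds_limit` helper are hypothetical; only the numeric limits come from these notes.

```python
from typing import Dict, Optional

# Limits as listed in these release notes; None stands for "unlimited".
# The name and the keys are illustrative, not the project's identifiers.
BACKEND_MAX_CONTENT_LENGTH: Dict[str, Optional[int]] = {
    "cloudflare": 800,
    "chromadb": 1500,
    "sqlite_vec": None,
    "hybrid": 800,
}


def exceeds_limit(backend: str, content: str) -> bool:
    """Return True when content is longer than the backend's character limit."""
    limit = BACKEND_MAX_CONTENT_LENGTH[backend]
    return limit is not None and len(content) > limit
```

With these values, the 1,570-character memory mentioned under Issues Resolved would exceed the Cloudflare and ChromaDB limits but not the SQLite-vec one.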
Intelligent Boundary-Preserving Splitting
- Respects natural boundaries: paragraphs → sentences → words → characters
- Configurable 50-character overlap for context continuity
- Tool descriptions tell the calling LLM about the limits up front
- Transparent to end users - automatic background processing
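The priority order and the overlap described above can be sketched as follows. This is a minimal illustration, not the code in content_splitter.py: the function name `split_content`, its signature, and the exact boundary markers are assumptions.

```python
from typing import List, Sequence

# Boundary markers in priority order: paragraphs, then sentences, then words.
# The exact markers are an assumption made for this sketch.
BOUNDARY_GROUPS = (("\n\n",), (". ", "! ", "? "), (" ",))


def _last_boundary_end(window: str, markers: Sequence[str]) -> int:
    """Index just past the last marker found in window, or -1 if none."""
    best = -1
    for marker in markers:
        pos = window.rfind(marker)
        if pos != -1:
            best = max(best, pos + len(marker))
    return best


def split_content(text: str, max_length: int, overlap: int = 50) -> List[str]:
    """Split text into chunks of at most max_length characters.

    Each chunk after the first starts with the last `overlap` characters
    of the previous chunk.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if not 0 <= overlap < max_length:
        raise ValueError("overlap must be >= 0 and smaller than max_length")

    chunks: List[str] = []
    start = 0
    while start < len(text):
        if len(text) - start <= max_length:
            chunks.append(text[start:])  # the remainder fits in one chunk
            break
        window = text[start:start + max_length]
        cut = len(window)  # last resort: cut at the character limit
        for markers in BOUNDARY_GROUPS:
            candidate = _last_boundary_end(window, markers)
            # Accept only cuts that still advance past the overlap region.
            if candidate > overlap:
                cut = candidate
                break
        chunks.append(window[:cut])
        start += cut - overlap
    return chunks
```

Because the overlap is counted in characters, a chunk may begin in the middle of a word; a real implementation might prefer to snap the overlap to a word boundary as well.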
Comprehensive Configuration System
- Environment variables for all limits (MCP_*_MAX_CONTENT_LENGTH)
- Toggle auto-split with MCP_ENABLE_AUTO_SPLIT (default: True)
- Configurable overlap: MCP_CONTENT_SPLIT_OVERLAP (default: 50)
- Boundary preservation: MCP_CONTENT_PRESERVE_BOUNDARIES (default: True)
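A rough sketch of how these variables might be read is shown below. The three split-related variable names and their defaults are taken from the list above; `MCP_CLOUDFLARE_MAX_CONTENT_LENGTH` is a guessed expansion of the `MCP_*_MAX_CONTENT_LENGTH` pattern, and the helper names and accepted boolean spellings are assumptions.

```python
import os
from typing import Any, Dict, Mapping

# Spellings treated as "true"; this set is an assumption for the sketch.
_TRUE_VALUES = {"1", "true", "yes", "on"}


def _env_bool(env: Mapping[str, str], name: str, default: bool) -> bool:
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUE_VALUES


def _env_int(env: Mapping[str, str], name: str, default: int) -> int:
    raw = env.get(name)
    if raw is None or not raw.strip():
        return default
    return int(raw)


def load_split_config(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Read the splitting settings, falling back to the documented defaults."""
    return {
        "enable_auto_split": _env_bool(env, "MCP_ENABLE_AUTO_SPLIT", True),
        "split_overlap": _env_int(env, "MCP_CONTENT_SPLIT_OVERLAP", 50),
        "preserve_boundaries": _env_bool(env, "MCP_CONTENT_PRESERVE_BOUNDARIES", True),
        # Hypothetical variable name following the MCP_*_MAX_CONTENT_LENGTH pattern.
        "cloudflare_max_length": _env_int(env, "MCP_CLOUDFLARE_MAX_CONTENT_LENGTH", 800),
    }
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.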
Technical Implementation
New Content Splitter Utility (content_splitter.py)
- Priority-based split points (paragraphs → sentences → words)
- Validation helpers (estimate_chunks_needed, validate_chunk_lengths)
- Comprehensive edge case handling
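The two validation helpers are named in these notes, but their signatures are not, so the versions below are assumptions meant to show the arithmetic. With an overlap, every chunk after the first contributes only `max_length - overlap` new characters.

```python
import math
from typing import Sequence


def estimate_chunks_needed(content_length: int, max_length: int, overlap: int = 50) -> int:
    """Estimate how many chunks a text of content_length characters needs.

    Assumes every chunk is filled to max_length; boundary-preserving
    splitting can produce shorter chunks, so treat this as a lower bound.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if not 0 <= overlap < max_length:
        raise ValueError("overlap must be >= 0 and smaller than max_length")
    if content_length <= 0:
        return 0
    if content_length <= max_length:
        return 1
    step = max_length - overlap  # new characters covered by each later chunk
    return 1 + math.ceil((content_length - max_length) / step)


def validate_chunk_lengths(chunks: Sequence[str], max_length: int) -> bool:
    """Return True when every chunk fits within max_length characters."""
    return all(len(chunk) <= max_length for chunk in chunks)
```

For the 1,570-character memory and the 800-character Cloudflare limit with a 50-character overlap, this gives 1 + ceil(770 / 750) = 3 chunks.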
Storage Backend Enhancements
- Abstract base class properties: max_content_length, supports_chunking
- All backends updated with appropriate limits and chunking support
- Batch storage operations with concurrent processing (asyncio.gather)
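One way the base-class properties and the concurrent batch operation could fit together is sketched below. The property names `max_content_length` and `supports_chunking` come from these notes; the class names, the `store` and `store_batch` methods, and the in-memory backend are hypothetical.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import List, Optional


class MemoryStorage(ABC):
    """Hypothetical base class; only the two property names come from the notes."""

    @property
    @abstractmethod
    def max_content_length(self) -> Optional[int]:
        """Character limit for this backend, or None when unlimited."""

    @property
    @abstractmethod
    def supports_chunking(self) -> bool:
        """Whether oversized content may be stored as several chunks."""

    @abstractmethod
    async def store(self, content: str) -> bool:
        """Store one item and report success."""

    async def store_batch(self, contents: List[str]) -> List[bool]:
        # Run all store calls concurrently; gather keeps the input order.
        return list(await asyncio.gather(*(self.store(c) for c in contents)))


class InMemoryStorage(MemoryStorage):
    """Toy backend used only to exercise the sketch."""

    def __init__(self, limit: Optional[int]) -> None:
        self._limit = limit
        self.saved: List[str] = []

    @property
    def max_content_length(self) -> Optional[int]:
        return self._limit

    @property
    def supports_chunking(self) -> bool:
        return True

    async def store(self, content: str) -> bool:
        if self._limit is not None and len(content) > self._limit:
            return False  # reject content over the limit
        self.saved.append(content)
        return True
```

Calling `asyncio.run(InMemoryStorage(800).store_batch(chunks))` would store every chunk concurrently and return one success flag per chunk.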
Enhanced MCP Server Tool
- store_memory tool with automatic transparent splitting
- Chunk metadata tracking: is_chunk, chunk_index, total_chunks
- Enhanced return types with TypedDict for better type safety
- Automatic chunk:N/M tags for easy retrieval
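The chunk metadata and tags might be assembled along these lines. The field names `is_chunk`, `chunk_index`, and `total_chunks` and the `chunk:N/M` tag format come from these notes; the TypedDict names, the `build_chunk_records` helper, and the choice of 1-based indices are assumptions.

```python
from typing import List, Sequence, TypedDict


class ChunkMetadata(TypedDict):
    is_chunk: bool
    chunk_index: int
    total_chunks: int


class ChunkRecord(TypedDict):
    content: str
    tags: List[str]
    metadata: ChunkMetadata


def build_chunk_records(chunks: Sequence[str], base_tags: Sequence[str]) -> List[ChunkRecord]:
    """Attach chunk metadata and a chunk:N/M tag to each chunk.

    Indices are 1-based here, which is an assumption of this sketch.
    """
    total = len(chunks)
    records: List[ChunkRecord] = []
    for index, chunk in enumerate(chunks, start=1):
        records.append(
            {
                "content": chunk,
                "tags": [*base_tags, f"chunk:{index}/{total}"],
                "metadata": {
                    "is_chunk": True,
                    "chunk_index": index,
                    "total_chunks": total,
                },
            }
        )
    return records
```

Searching by the shared base tags would then return all chunks of a memory, and the chunk tag gives their order.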
Performance Optimizations
- Concurrent batch operations via asyncio.gather
- Accurate chunk estimation accounting for overlap
- Minimal overhead for content within limits (O(1) property check)
- Efficient single-pass splitting with smart boundary detection
Comprehensive Testing
- 20+ test cases in test_content_splitting.py
- Coverage: basic splitting, boundary preservation, overlap validation
- Backend limit verification for all 4 storage backends
- Edge case testing: empty content, exact lengths, invalid overlaps
Documentation & Developer Experience
- Clear CHANGELOG.md entry with technical details
- Enhanced tool docstrings visible to LLMs
- Comprehensive inline documentation
- Validation and startup logging
Issues Resolved
- Fixes: First memory store attempt (1,570 chars) exceeded Cloudflare's BGE model limit
- Prevents: Embedding failures across all storage backends
- Eliminates: Silent content truncation or rejection
Additional Context
- PR: #143 (5 rounds of Gemini Code Assistant review)
- Feature Branch: feat/content-length-limits-with-splitting
- Design: Conservative limits with buffer for tokenization variance
- Backward Compatible: No breaking changes to existing functionality
Full Changelog
See CHANGELOG.md for complete details.
Generated with Claude Code
Co-Authored-By: Claude noreply@anthropic.com