github kreuzberg-dev/kreuzberg v3.14.0

3 months ago

🚀 Major Enhancements

API Server Improvements

  • Increased file upload limit to 1 GB, resolving HTTP 413 (Payload Too Large) errors when processing large documents
  • Comprehensive OpenAPI documentation with Swagger UI, ReDoc, and multiple interface options
  • Type-safe API responses using TypedDict instead of generic dict[str, Any]
  • Enhanced error handling with detailed debugging support via DEBUG environment variable
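
As a rough illustration of the TypedDict point above, the sketch below contrasts a typed response with a generic dict[str, Any]. The class name ErrorResponse, its fields, and the build_error helper are all hypothetical and are not taken from Kreuzberg's actual response types.

```python
from typing import Any, TypedDict


class ErrorResponse(TypedDict):
    """Hypothetical shape of a structured error payload (field names are assumed)."""

    message: str
    status_code: int
    details: dict[str, Any]


def build_error(message: str, status_code: int, **context: Any) -> ErrorResponse:
    # With a TypedDict return type, a static type checker can verify that every
    # key is present and correctly typed; dict[str, Any] gives no such guarantee.
    return {"message": message, "status_code": status_code, "details": dict(context)}


error = build_error("Request body too large", 413, limit_bytes=1024**3)
print(error["status_code"])
```

At runtime a TypedDict is still an ordinary dict, so the benefit is in static checking and editor completion rather than in validation.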

DPI Configuration System

  • Automatic image size optimization prevents "Image too large" OCR failures
  • Configurable DPI settings with intelligent auto-adjustment for large documents
  • Tunable performance vs. quality trade-off through a customizable target DPI and min/max bounds
  • Seamless integration across all OCR backends (Tesseract, EasyOCR, PaddleOCR)

Infrastructure & Testing

  • Comprehensive test reorganization - Tests now organized in logical folders (core, features, integration)
  • Enhanced CI pipeline with improved coverage combining and error handling
  • New DPI integration tests ensuring robust image processing across all document types
  • Quality assurance improvements with expanded regression testing

Documentation Updates

  • Updated extraction configuration docs with DPI system examples and best practices
  • Enhanced OCR configuration guide with performance optimization recommendations
  • Complete API server documentation including OpenAPI endpoint references
  • User-focused approach emphasizing practical usage over implementation details

📋 Technical Details

New Configuration Options

from kreuzberg import ExtractionConfig

config = ExtractionConfig(
    target_dpi=150,              # Optimal balance of quality and performance
    max_image_dimension=25000,   # Prevent memory issues with large documents
    auto_adjust_dpi=True,        # Intelligent scaling for oversized images
    min_dpi=72,                  # Minimum readable resolution
    max_dpi=600,                 # Maximum before diminishing returns
)
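
The release notes do not spell out how auto-adjustment chooses a DPI, so the following is only a minimal sketch of one plausible approach: lower the DPI until the longest rendered side fits within max_image_dimension, then clamp to the configured bounds. The function choose_dpi is hypothetical and is not Kreuzberg's implementation. It also assumes that page sizes are given in PDF points and that max_image_dimension is measured in pixels.

```python
def choose_dpi(
    width_pt: float,
    height_pt: float,
    target_dpi: int = 150,
    max_image_dimension: int = 25000,
    min_dpi: int = 72,
    max_dpi: int = 600,
) -> int:
    """Pick a render DPI so the longest side stays within max_image_dimension.

    Illustrative only. Page size is in PDF points (1 point = 1/72 inch).
    """
    longest_side_in = max(width_pt, height_pt) / 72.0
    # Highest DPI at which the longest side still fits the pixel limit.
    fitting_dpi = max_image_dimension / longest_side_in
    dpi = min(target_dpi, fitting_dpi)
    # Clamp to the configured bounds. For extremely large pages min_dpi wins,
    # so the rendered image may still exceed max_image_dimension.
    return int(max(min_dpi, min(max_dpi, dpi)))


print(choose_dpi(595, 842))      # A4 page: the target DPI fits as-is
print(choose_dpi(14400, 7200))   # 200-inch-wide sheet: DPI is scaled down
```

Under these assumptions an ordinary A4 page keeps the target DPI, while only very large pages are scaled down.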

API Improvements

  • 1GB file upload support via request_max_body_size configuration
  • OpenAPI documentation available at multiple endpoints (/schema/swagger, /schema/redoc, etc.)
  • Structured error responses with context for debugging
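
On the client side, the points above can be sketched as a pre-upload size check plus a helper that builds the documentation URLs. Everything here is an assumption for illustration: the helper names are hypothetical, "1GB" is taken to mean 1024**3 bytes, and the base URL is a placeholder. Only the two paths /schema/swagger and /schema/redoc come from the notes above.

```python
MAX_UPLOAD_BYTES = 1024**3  # assumption: the "1GB" limit is interpreted as 1 GiB


def fits_upload_limit(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> bool:
    # The server limit applies to the whole request body, so multipart
    # framing adds a small overhead on top of the raw file size.
    return size_bytes <= limit


def docs_urls(base_url: str) -> list[str]:
    # Only the two documentation paths named in the release notes are listed.
    base = base_url.rstrip("/")
    return [f"{base}/schema/swagger", f"{base}/schema/redoc"]


print(fits_upload_limit(500 * 1024**2))      # a 500 MiB file
print(docs_urls("http://localhost:8000/"))   # placeholder host and port
```

Checking the size before sending lets a client skip an upload that the server would reject with a 413.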

🔧 Breaking Changes

None - all changes are backward compatible.

🐛 Bug Fixes

  • Fixed CI coverage combining issues with lcov command syntax
  • Resolved pre-commit formatting conflicts
  • Eliminated "Image too large" errors across all OCR backends

Full Changelog: v3.13.4...v3.14.0
