github datalab-to/marker v1.4.0
LLM fixes; new benchmarks

16 months ago

New benchmarks

Overall

Marker is now benchmarked against LlamaParse, Docling, and Mathpix (see the README for how to run the benchmarks). It performs favorably against these alternatives in speed, LLM-as-judge scoring, and heuristic scoring.

image

Table

Table conversion is benchmarked against Gemini Flash:

image

Update gemini model

  • Use the new genai library
  • Update to Gemini 2.0 Flash
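
For readers unfamiliar with the newer SDK, the following is a minimal sketch of calling a Gemini 2.0 Flash model through the `google-genai` Python package. It is not Marker's own code: the helper names `build_request` and `ask_gemini`, the `GEMINI_API_KEY` environment variable, and the model id `gemini-2.0-flash` are assumptions made for illustration.

```python
import os
from typing import Optional

# Assumed model id for "gemini flash 2.0"; check the model list for your account.
MODEL_NAME = "gemini-2.0-flash"


def build_request(prompt: str, model: str = MODEL_NAME) -> dict:
    """Collect the keyword arguments passed to generate_content (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {"model": model, "contents": prompt}


def ask_gemini(prompt: str, api_key: Optional[str] = None) -> Optional[str]:
    """Send a text prompt and return the response text (hypothetical helper)."""
    # Imported lazily so this module still loads when the SDK is not installed.
    from google import genai

    # GEMINI_API_KEY is an assumed variable name used when no key is passed in.
    client = genai.Client(api_key=api_key or os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(**build_request(prompt))
    return response.text
```

Calling `ask_gemini("Summarize this table as Markdown")` needs network access and a valid API key; `build_request` can be checked offline.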

Misc bugfixes

  • Fix bug with OCR heuristics not being aggressive enough
  • Fix bug with empty tables
  • Ensure references get passed through in LLM processors

What's Changed

Full Changelog: v1.3.5...v1.4.0
