github datalab-to/surya v0.14.0
Surya OCR 3

10 months ago

Surya OCR version 3!

The latest version of Surya OCR has a new architecture and is trained on significantly more data than before.

Some notable features:

  • Supports 90+ languages
  • Handles inline math and equations
  • Character-, word-, and line-level bounding boxes (bboxes) available
  • Significantly better Chinese performance
  • Very fast: 10,000+ tokens/second on an A100 with vLLM
  • Continuous batching in base configuration for ~2x speedup
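
To make the relationship between character-level and word-level boxes concrete, here is a minimal sketch of how character boxes could be merged into word boxes by taking the union of their coordinates. This is not Surya's API or its internal method: the `CharBox` type, the `(x0, y0, x1, y1)` layout, and the whitespace-based grouping are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class CharBox:
    """Hypothetical container: one recognized character and its box."""
    text: str
    bbox: tuple  # assumed layout: (x0, y0, x1, y1)


def union_bbox(boxes):
    """Smallest axis-aligned box that encloses every box in `boxes`."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def chars_to_words(chars):
    """Group consecutive non-space characters into (word, bbox) pairs."""
    words, current = [], []
    for ch in chars:
        if ch.text.isspace():
            # Whitespace closes the word collected so far.
            if current:
                words.append(current)
                current = []
        else:
            current.append(ch)
    if current:
        words.append(current)
    return [
        ("".join(c.text for c in w), union_bbox([c.bbox for c in w]))
        for w in words
    ]


# Made-up coordinates, for illustration only.
sample = [
    CharBox("h", (0, 0, 5, 10)),
    CharBox("i", (5, 0, 8, 10)),
    CharBox(" ", (8, 0, 12, 10)),
    CharBox("y", (12, 0, 17, 12)),
    CharBox("o", (17, 2, 22, 10)),
]
words = chars_to_words(sample)
```

The same union step could be applied once more to go from word boxes to a line box. Consult the Surya documentation for the actual output fields.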

Updated benchmarks coming soon.

Examples

Word boxes

[image: word boxes example]

Math

[image: math example]

Chinese

[image: Chinese example]

What's Changed

Full Changelog: v0.13.1...v0.14.0
