github mindee/doctr v0.1.0
v0.1.0: Pretrained models for seamless end-to-end OCR

This first release adds pretrained models for end-to-end OCR and document manipulation utilities.

Release handled by @fg-mindee & @charlesmindee

Note: doctr 0.1.0 requires TensorFlow 2.3.0 or newer.

Highlights

Easy & high-performing document reading

Since document processing is at the core of this project, reading documents efficiently is a priority. This release supports PDF and image files.

PDF reading is a wrapper around the PyMuPDF backend for fast file reading:

from doctr.documents import read_pdf
# from path
doc = read_pdf("path/to/your/doc.pdf")
# from stream
with open("path/to/your/doc.pdf", 'rb') as f:
    doc = read_pdf(f.read())

Image reading uses the OpenCV backend:

from doctr.documents import read_img
page = read_img("path/to/your/img.jpg")
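
The snippets above show that read_pdf accepts either a file path or the raw bytes of a file. As a rough illustration of how such a path-or-bytes entry point can be structured, here is a minimal standard-library sketch. The helper name load_file_bytes is hypothetical and is not part of doctr; the library's actual implementation may differ.

```python
from pathlib import Path
from typing import Union


def load_file_bytes(file: Union[str, Path, bytes, bytearray]) -> bytes:
    """Hypothetical helper (not part of doctr): return the raw bytes of a file
    given either a path or content that is already in memory."""
    if isinstance(file, (bytes, bytearray)):
        # Content already in memory, e.g. the result of f.read()
        return bytes(file)
    if isinstance(file, (str, Path)):
        path = Path(file)
        if not path.is_file():
            raise FileNotFoundError(f"unable to access {path}")
        return path.read_bytes()
    raise TypeError(f"unsupported input type: {type(file).__name__}")
```

Resolving both input kinds to bytes up front lets the decoding step that follows deal with a single input type.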

Pretrained End-to-End OCR predictors

Whether you need text detection, text recognition or end-to-end OCR, this release provides pretrained models and high-level predictors that handle all preprocessing, model inference and post-processing for you, behind a simple Pythonic interface.
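
To make the idea of a predictor concrete, the sketch below chains the three steps named above (preprocessing, model inference, post-processing) over batches of inputs. The Predictor class, its arguments and the toy stand-ins are all hypothetical, chosen for illustration only; they do not reproduce doctr's actual classes.

```python
from typing import Any, Callable, List, Sequence


class Predictor:
    """Hypothetical wrapper (not doctr's class) chaining the three stages."""

    def __init__(
        self,
        pre: Callable[[Any], Any],
        model: Callable[[List[Any]], List[Any]],
        post: Callable[[Any], Any],
        batch_size: int = 2,
    ) -> None:
        self.pre, self.model, self.post = pre, model, post
        self.batch_size = batch_size

    def __call__(self, pages: Sequence[Any]) -> List[Any]:
        results: List[Any] = []
        # Process the inputs batch by batch
        for start in range(0, len(pages), self.batch_size):
            batch = [self.pre(p) for p in pages[start:start + self.batch_size]]
            raw = self.model(batch)  # one raw output per input
            results.extend(self.post(out) for out in raw)
        return results


# Toy stand-ins: "pages" are strings and the "model" measures their length
predictor = Predictor(
    pre=str.strip,
    model=lambda batch: [len(x) for x in batch],
    post=lambda n: {"length": n},
)
print(predictor(["  a ", "bcd", " ef"]))
```

The caller only passes raw inputs and receives structured outputs; batching and the intermediate formats stay hidden inside the wrapper.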

Text detection

Currently, only DBNet-based architectures are supported; more will come in the next releases!

from doctr.documents import read_pdf
from doctr.models import db_resnet50_predictor
model = db_resnet50_predictor(pretrained=True)
doc = read_pdf("path/to/your/doc.pdf")
result = model(doc)

Text recognition

Two architectures are implemented for recognition: CRNN and SAR.

from doctr.models import crnn_vgg16_bn_predictor
model = crnn_vgg16_bn_predictor(pretrained=True)

End-to-End OCR

OCR predictors combine a detection model and a recognition model into a two-stage architecture, giving you the easiest way to analyze your document:

from doctr.documents import read_pdf
from doctr.models import ocr_db_crnn

model = ocr_db_crnn(pretrained=True)
doc = read_pdf("path/to/your/doc.pdf")
result = model([doc])
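
As a rough sketch of what a two-stage architecture means here: a detector first proposes text boxes, each box is cropped from the page, and a recognizer then reads every crop. Everything below is a toy, assumption-laden illustration: pages are lists of text rows rather than images, and two_stage_ocr, toy_detector, toy_recognizer and the box convention are hypothetical names, not doctr's API.

```python
import re
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax), an assumed convention
Page = List[str]  # toy page: rows of characters standing in for pixel rows


def crop(page: Page, box: Box) -> Page:
    """Extract the region covered by a box."""
    xmin, ymin, xmax, ymax = box
    return [row[xmin:xmax] for row in page[ymin:ymax]]


def two_stage_ocr(
    page: Page,
    detector: Callable[[Page], List[Box]],
    recognizer: Callable[[List[Page]], List[str]],
) -> List[Dict]:
    boxes = detector(page)                      # stage 1: localize text
    crops = [crop(page, box) for box in boxes]  # cut out each detected region
    words = recognizer(crops)                   # stage 2: read each region
    return [{"box": b, "value": w} for b, w in zip(boxes, words)]


def toy_detector(page: Page) -> List[Box]:
    # Stand-in for a detection model: one box per run of non-space characters
    return [
        (m.start(), y, m.end(), y + 1)
        for y, row in enumerate(page)
        for m in re.finditer(r"\S+", row)
    ]


def toy_recognizer(crops: List[Page]) -> List[str]:
    # Stand-in for a recognition model: the crop already contains its text
    return ["".join(c) for c in crops]


print(two_stage_ocr(["hello world"], toy_detector, toy_recognizer))
```

Because the two stages only communicate through boxes and crops, either model can be swapped without touching the other.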

New features

Documents

Document reading and manipulation

Models

Deep learning model building and inference

Utils

Utility features relevant to the library use cases.

  • Added page interactive prediction visualization (#54, #82)
  • Added custom types (#87)
  • Added abstract auto-repr object (#102)
  • Added metric module (#110)

Test

Checks on the package's health before release

Documentation

Online resources for potential users

Others

Other tools and implementations

  • Added python package setup (#7, #21, #67)
  • Added CI verifications (#7, #67, #69, #73)
  • Added dockerized environment with library installed (#17, #19)
  • Added issue template (#34)
  • Added environment collection script (#81)
  • Added analysis script (#85, #95, #103)
