This patch release fixes several bugs, introduces OCR datasets, and improves model performance.

Release handled by @fg-mindee & @charlesmindee

Note: doctr 0.1.1 requires TensorFlow 2.3.0 or higher.

Highlights

Introduction of vision datasets

Whether for training or evaluation, DocTR now provides objects to easily download and manipulate datasets. You can access OCR datasets in a few lines of code:

from doctr.datasets import FUNSD
train_set = FUNSD(train=True, download=True)
img, target = train_set[0]

Model evaluation

While DocTR 0.1.0 gave you access to pretrained models, the only way to know their performance was to compute it yourself. This release adds a performance benchmark to the documentation for all our models and makes the evaluation script available so that the results can be reproduced:

python scripts/evaluate.py ocr_db_crnn_vgg

Demo app

Since we want DocTR to make it convenient for you to build OCR-related applications and services, we made a minimal Streamlit demo app to showcase its text detection capabilities. You can run the demo with the following command:

streamlit run demo/app.py

[Screenshot: the demo app performing text detection on a sample document]

Breaking changes

Metric update & summary

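For improved clarity, the state-updating and summarizing methods of the evaluation metrics were renamed: `update_state` became `update`, and `result` became `summary`.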

0.1.0	0.1.1
`>>> from doctr.utils import ExactMatch`	`>>> from doctr.utils import ExactMatch`
`>>> metric = ExactMatch()`	`>>> metric = ExactMatch()`
`>>> metric.update_state(['Hello', 'world'], ['hello', 'world'])`	`>>> metric.update(['Hello', 'world'], ['hello', 'world'])`
`>>> metric.result()`	`>>> metric.summary()`
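
To make the renamed interface concrete, here is a minimal, self-contained sketch of a metric exposing `update` and `summary`. `SimpleExactMatch` is a hypothetical stand-in written for illustration only; it is not DocTR's implementation, and the real `ExactMatch` may differ in options and return type.

```python
from typing import List


class SimpleExactMatch:
    """Illustrative stand-in (not DocTR's class): share of exactly matching strings."""

    def __init__(self) -> None:
        self.matches = 0
        self.total = 0

    def update(self, gt: List[str], pred: List[str]) -> None:
        # Accumulate counts over one batch of ground truths and predictions
        if len(gt) != len(pred):
            raise ValueError("gt and pred must have the same length")
        self.matches += sum(g == p for g, p in zip(gt, pred))
        self.total += len(gt)

    def summary(self) -> float:
        # Aggregate everything accumulated since the metric was created
        if self.total == 0:
            raise ValueError("update() must be called before summary()")
        return self.matches / self.total


metric = SimpleExactMatch()
metric.update(["Hello", "world"], ["hello", "world"])
print(metric.summary())  # 0.5: only "world" matches exactly (comparison is case-sensitive here)
```

Splitting accumulation (`update`) from aggregation (`summary`) lets the metric be fed batch by batch before a single score is read out.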

Renaming of high-level predictors

As the range of backbones and combinations evolves, we have updated the names of the high-level predictors:

0.1.0	0.1.1
`>>> from doctr.models import ocr_db_crnn`	`>>> from doctr.models import ocr_db_crnn_vgg`
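
For code that has to run against both versions during a migration, one possible approach is to look up the new name first and fall back to the old one. The sketch below assumes nothing about DocTR beyond the two names in the table; `resolve_predictor` is a hypothetical helper, and a `SimpleNamespace` stands in for the `doctr.models` module so that the snippet runs without the library installed.

```python
from types import SimpleNamespace


def resolve_predictor(module, candidates):
    """Return the first attribute of `module` found among `candidates` (hypothetical helper)."""
    for name in candidates:
        factory = getattr(module, name, None)
        if factory is not None:
            return factory
    raise AttributeError(f"none of {candidates} found in {module!r}")


# Stand-in for `doctr.models` as of 0.1.0, where only the old name exists
fake_models_v010 = SimpleNamespace(ocr_db_crnn=lambda: "predictor")

# Prefer the 0.1.1 name and fall back to the 0.1.0 one
factory = resolve_predictor(fake_models_v010, ["ocr_db_crnn_vgg", "ocr_db_crnn"])
print(factory())  # predictor
```

With the library installed, the same helper could be given the real `doctr.models` module in place of the stand-in.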

New features

Datasets

Easy-to-use datasets for OCR

Added predefined vocabs (#116)
Added string encoding/decoding utilities (#116)
Added FUNSD dataset (#136, #141)
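
As a rough illustration of what vocabulary-based string encoding and decoding involve, here is a minimal sketch. The vocabulary and the helper names `encode_text` and `decode_indices` are assumptions made for this example; they are not DocTR's predefined vocabs or its actual utility functions.

```python
import string
from typing import List

# Hypothetical vocabulary for illustration; DocTR's predefined vocabs may differ
VOCAB = string.digits + string.ascii_letters


def encode_text(text: str, vocab: str) -> List[int]:
    # Map each character to its position in the vocabulary
    lookup = {char: idx for idx, char in enumerate(vocab)}
    try:
        return [lookup[char] for char in text]
    except KeyError as exc:
        raise ValueError(f"character {exc.args[0]!r} is not in the vocabulary") from exc


def decode_indices(indices: List[int], vocab: str) -> str:
    # Inverse mapping: positions back to characters
    return "".join(vocab[idx] for idx in indices)


encoded = encode_text("doctr", VOCAB)
print(encoded)  # [13, 24, 12, 29, 27]
print(decode_indices(encoded, VOCAB))  # doctr
```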

Models

Deep learning model building and inference

Added ResNet-31 backbone to SAR (#132) and CRNN (#148)

Utils

Utility features relevant to the library use cases.

Added localization (#117) & end-to-end OCR (#122, #141) metrics

Test

Verifications of the package well-being before release

Added unittests for evaluation metrics (#117, #122)
Added unittests for string encoding/decoding (#116)
Added unittests for datasets (#136, #141)
Added unittests for pretrained crnn_resnet31 (#148), and OCR predictors (#150)

Documentation

Online resources for potential users

Added pypi badge to README (#114)
Added pypi installation instructions to documentation (#114)
Added evaluation metric section (#117, #122, #158)
Added multi-version documentation deployment (#123)
Added datasets page in documentation (#136, #154)
Added performance benchmark on FUNSD in documentation (#143, #149, #150, #155)
Added instructions in README to run the demo app (#146)
Added sar_resnet31 to recognition models documentation (#150)

Others

Other tools and implementations

Added default label to bug report issues (#121)
Updated CI job for documentation build (#123)
Added CI job to ensure analyze.py script runs (#142)
Added evaluation script (#141, #145, #151)
Added text detection demo app (#146)

Bug fixes

Models

Fixed no-detection predictor export (#119)
Fixed edge case of polygon to box computation (#139)
Fixed DB bitmap_to_boxes method (#155)

Utils

Fixed typo in ExactMatch (#120)
Fixed IoU computation when boxes are distant (#140)
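
The notes above do not detail the fix in #140. As a general illustration of why distant boxes need care, the sketch below computes the IoU of two axis-aligned boxes and clamps the overlap extents at zero. Without the clamp, two negative extents would multiply into a spurious positive intersection. The `(xmin, ymin, xmax, ymax)` box format and the `box_iou` name are assumptions made for this example, not DocTR's API.

```python
from typing import Sequence


def box_iou(box_a: Sequence[float], box_b: Sequence[float]) -> float:
    """IoU of two boxes given as (xmin, ymin, xmax, ymax). Illustrative helper."""
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])

    # Clamp at zero: for disjoint boxes, these differences are negative
    inter_w = max(right - left, 0.0)
    inter_h = max(bottom - top, 0.0)
    inter = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


print(box_iou((0, 0, 1, 1), (2, 2, 3, 3)))  # 0.0: the boxes do not overlap
print(box_iou((0, 0, 2, 2), (0, 0, 2, 2)))  # 1.0: identical boxes
```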

Test

Documentation

Fixed docstring examples of predictors (#126)
Fixed multi-version documentation build (#138)
Fixed docstrings of VisionDataset and FUNSD (#147)
Fixed usage instructions in README (#150)
Fixed installation instructions in documentation (#154)

Others

Fixed pypi release CI job (#153)

Improvements

Models

Added dimension check on predictor's inputs (#126)
Updated pretrained DBNet URLs (#129, #150)
Improved DBNet post-processing (#130, #150, #155, #157)
Moved normalization parameters to config (#133, #150)
Refactored file downloading (#136)
Increased default batch size for recognition (#143)
Updated max_length and input_shape of SAR (#143)
Added support of absolute coordinates for crop extraction (#145)
Added proper kernel sizing to silence TF unresolved checkpoints warnings (#152, #156)
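
To illustrate the difference between relative and absolute coordinates in crop extraction, here is a minimal sketch using plain nested lists as an image. The helper names `to_absolute` and `extract_crop`, and the `(xmin, ymin, xmax, ymax)` box format, are assumptions made for this example and do not reflect DocTR's actual functions or signatures.

```python
from typing import List, Sequence, Tuple


def to_absolute(box: Sequence[float], height: int, width: int) -> Tuple[int, int, int, int]:
    # Scale relative coordinates in [0, 1] to pixel indices
    xmin, ymin, xmax, ymax = box
    return (
        int(round(xmin * width)),
        int(round(ymin * height)),
        int(round(xmax * width)),
        int(round(ymax * height)),
    )


def extract_crop(image: List[List[int]], box: Sequence[int]) -> List[List[int]]:
    # `box` is in absolute pixel coordinates; the image is a list of rows
    xmin, ymin, xmax, ymax = box
    return [row[xmin:xmax] for row in image[ymin:ymax]]


# 4x4 toy image whose pixel values are their flat index
image = [[r * 4 + c for c in range(4)] for r in range(4)]

abs_box = to_absolute((0.5, 0.0, 1.0, 0.5), height=4, width=4)
print(abs_box)  # (2, 0, 4, 2)
print(extract_crop(image, abs_box))  # [[2, 3], [6, 7]]
```

Accepting absolute coordinates directly avoids a round trip through relative values when the caller already works in pixels.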

Utils

Renamed state updating and summarizing methods of metrics (#117)
Updated text distance computation backend (#128)
Simplified repr of NestedObject when it has no children (#137)

Documentation

Cleaned README prerequisites & URLs (#125)
Added usage example for images in README (#125)
Updated installation instructions in README (#154)
Added docstring examples to FUNSD (#154)
Added docstring examples to evaluation metrics (#154)

Others

Updated environment collection script & bug report template (#135)
Enabled GPU on analyze.py script (#141)

mindee/doctr v0.1.1 v0.1.1: OCR datasets and improved model performances on GitHub
