This patch release fixes several bugs, introduces OCR datasets, and improves model performance.
Release handled by @fg-mindee & @charlesmindee
Note: doctr 0.1.1 requires TensorFlow 2.3.0 or higher.
Highlights
Introduction of vision datasets
Whether this is for training or evaluation purposes, DocTR provides you with objects to easily download and manipulate datasets. Access OCR datasets within a few lines of code:
from doctr.datasets import FUNSD
train_set = FUNSD(train=True, download=True)
img, target = train_set[0]
Model evaluation
While DocTR 0.1.0 gave you access to pretrained models, the only way to know how well they performed was to evaluate them yourself. We have now added a performance benchmark for all our models to the documentation and made the evaluation script available so that the results can be reproduced:
python scripts/evaluate.py ocr_db_crnn_vgg
Demo app
Since we want DocTR to make it convenient for you to build OCR-related applications and services, we made a minimal Streamlit demo app to showcase its text detection capabilities. You can run the demo with the following command:
streamlit run demo/app.py
Here is how it renders performing text detection on a sample document:
Breaking changes
Metric update & summary
For improved clarity, the evaluation metrics' methods were renamed.
| 0.1.0 | 0.1.1 |
|---|---|
| `>>> from doctr.utils import ExactMatch` | `>>> from doctr.utils import ExactMatch` |
| `>>> metric = ExactMatch()` | `>>> metric = ExactMatch()` |
| `>>> metric.update_state(['Hello', 'world'], ['hello', 'world'])` | `>>> metric.update(['Hello', 'world'], ['hello', 'world'])` |
| `>>> metric.result()` | `>>> metric.summary()` |
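To make the renamed interface concrete, here is a minimal sketch of an exact-match metric exposing `update` and `summary`. The class `SimpleExactMatch` is a hypothetical stand-in written for illustration, not DocTR's implementation; the `ignore_case` option and the choice to return the fraction of matching pairs are assumptions.

```python
from typing import List


class SimpleExactMatch:
    """Hypothetical stand-in for an exact-match metric (not DocTR's code)."""

    def __init__(self, ignore_case: bool = False) -> None:
        # ignore_case is an illustrative option assumed for this sketch
        self.ignore_case = ignore_case
        self.matches = 0
        self.total = 0

    def update(self, gt: List[str], pred: List[str]) -> None:
        """Accumulate matches for one batch (named update_state in 0.1.0)."""
        if len(gt) != len(pred):
            raise ValueError("gt and pred must have the same length")
        for g, p in zip(gt, pred):
            if self.ignore_case:
                g, p = g.lower(), p.lower()
            self.matches += int(g == p)
        self.total += len(gt)

    def summary(self) -> float:
        """Return the fraction of exact matches so far (named result in 0.1.0)."""
        if self.total == 0:
            raise ValueError("update() must be called before summary()")
        return self.matches / self.total


metric = SimpleExactMatch()
metric.update(["Hello", "world"], ["hello", "world"])
score = metric.summary()  # 0.5: only "world" matches exactly
```

Keeping accumulation (`update`) separate from reporting (`summary`) lets the metric be fed batch by batch and queried once at the end of an evaluation run.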
Renaming of high-level predictors
As the range of backbones and combinations evolves, we have updated the names of the high-level predictors:
| 0.1.0 | 0.1.1 |
|---|---|
| `>>> from doctr.models import ocr_db_crnn` | `>>> from doctr.models import ocr_db_crnn_vgg` |
New features
Datasets
Easy-to-use datasets for OCR
- Added predefined vocabs (#116)
- Added string encoding/decoding utilities (#116)
- Added `FUNSD` dataset (#136, #141)
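As an illustration of what the string encoding/decoding utilities listed above are for, the sketch below maps text to integer indices over a vocabulary and back. The names `VOCAB`, `encode_text` and `decode_indices` are hypothetical and chosen for this example; DocTR's predefined vocabs and function signatures may differ.

```python
from typing import List

# Illustrative vocabulary; DocTR's predefined vocabs may differ
VOCAB = "abcdefghijklmnopqrstuvwxyz0123456789"


def encode_text(text: str, vocab: str) -> List[int]:
    """Map each character to its index in the vocabulary."""
    index = {char: idx for idx, char in enumerate(vocab)}
    try:
        return [index[char] for char in text]
    except KeyError as exc:
        raise ValueError(f"character {exc.args[0]!r} is not in the vocabulary") from exc


def decode_indices(indices: List[int], vocab: str) -> str:
    """Map a sequence of vocabulary indices back to a string."""
    return "".join(vocab[idx] for idx in indices)


encoded = encode_text("ocr2021", VOCAB)
decoded = decode_indices(encoded, VOCAB)  # round-trips to "ocr2021"
```

A recognition model typically predicts such index sequences, so a shared vocabulary is what lets targets and predictions be compared as plain strings.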
Models
Deep learning model building and inference
Utils
Utility features relevant to the library use cases.
Test
Checks on the package's health before release
- Added unittests for evaluation metrics (#117, #122)
- Added unittests for string encoding/decoding (#116)
- Added unittests for datasets (#136, #141)
- Added unittests for pretrained `crnn_resnet31` (#148), and OCR predictors (#150)
Documentation
Online resources for potential users
- Added pypi badge to README (#114)
- Added pypi installation instructions to documentation (#114)
- Added evaluation metric section (#117, #122, #158)
- Added multi-version documentation deployment (#123)
- Added datasets page in documentation (#136, #154)
- Added performance benchmark on `FUNSD` in documentation (#143, #149, #150, #155)
- Added instructions in README to run the demo app (#146)
- Added `sar_resnet31` to recognition models documentation (#150)
Others
Other tools and implementations
- Added default label to bug report issues (#121)
- Updated CI job for documentation build (#123)
- Added CI job to ensure `analyze.py` script runs (#142)
- Added evaluation script (#141, #145, #151)
- Added text detection demo app (#146)
Bug fixes
Models
- Fixed no-detection predictor export (#119)
- Fixed edge case of polygon to box computation (#139)
- Fixed DB `bitmap_to_boxes` method (#155)
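For readers unfamiliar with the polygon to box step mentioned above, here is a minimal sketch of turning a polygon into its axis-aligned bounding box. The function `polygon_to_box` is a hypothetical helper written for this example, not DocTR's code. These notes do not describe the edge case fixed in #139, so the sketch does not attempt to reproduce it.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_to_box(polygon: List[Point]) -> Tuple[float, float, float, float]:
    """Return (xmin, ymin, xmax, ymax) enclosing all polygon vertices."""
    if not polygon:
        raise ValueError("polygon must contain at least one point")
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


# A slightly skewed quadrilateral in relative coordinates
box = polygon_to_box([(0.2, 0.1), (0.6, 0.15), (0.55, 0.4), (0.15, 0.35)])
# box == (0.15, 0.1, 0.6, 0.4)
```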
Utils
Test
Documentation
- Fixed docstring examples of predictors (#126)
- Fixed multi-version documentation build (#138)
- Fixed docstrings of `VisionDataset` and `FUNSD` (#147)
- Fixed usage instructions in README (#150)
- Fixed installation instructions in documentation (#154)
Others
- Fixed pypi release CI job (#153)
Improvements
Models
- Added dimension check on predictor's inputs (#126)
- Updated pretrained DBNet URLs (#129, #150)
- Improved DBNet post-processing (#130, #150, #155, #157)
- Moved normalization parameters to config (#133, #150)
- Refactored file downloading (#136)
- Increased default batch size for recognition (#143)
- Updated `max_length` and `input_shape` of SAR (#143)
- Added support of absolute coordinates for crop extraction (#145)
- Added proper kernel sizing to silence TF unresolved checkpoints warnings (#152, #156)
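The difference between relative and absolute coordinates for crop extraction can be sketched as follows. This is an illustrative example on a nested-list "image", not DocTR's implementation: `to_absolute`, `extract_crop` and the `absolute` flag are hypothetical names assumed here.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def to_absolute(box: Box, width: int, height: int) -> Tuple[int, int, int, int]:
    """Scale a relative box (values in [0, 1]) to integer pixel coordinates."""
    xmin, ymin, xmax, ymax = box
    return (
        round(xmin * width),
        round(ymin * height),
        round(xmax * width),
        round(ymax * height),
    )


def extract_crop(image: List[List[int]], box: Box, absolute: bool = False) -> List[List[int]]:
    """Crop a row-major image using a relative or an absolute box."""
    height, width = len(image), len(image[0])
    if absolute:
        # Box is already expressed in pixels
        xmin, ymin, xmax, ymax = (int(v) for v in box)
    else:
        xmin, ymin, xmax, ymax = to_absolute(box, width, height)
    return [row[xmin:xmax] for row in image[ymin:ymax]]


# 4x4 toy image whose pixel value encodes its (row, column) position
image = [[10 * r + c for c in range(4)] for r in range(4)]
crop_rel = extract_crop(image, (0.25, 0.25, 0.75, 0.75))
crop_abs = extract_crop(image, (1, 1, 3, 3), absolute=True)
# Both calls select the central 2x2 patch: [[11, 12], [21, 22]]
```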
Utils
- Renamed state updating and summarizing methods of metrics (#117)
- Updated text distance computation backend (#128)
- Simplified repr of `NestedObject` instances when they have no children (#137)
Documentation
- Cleaned README prerequisites & URLs (#125)
- Added usage example for images in README (#125)
- Updated installation instructions in README (#154)
- Added docstring examples to `FUNSD` (#154)
- Added docstring examples to evaluation metrics (#154)