Breaking Changes
- `createWorker` is now async
  - In most code this means `worker = Tesseract.createWorker()` should be replaced with `worker = await Tesseract.createWorker()`
  - Calling with invalid `workerPath` or `corePath` now produces error/rejected promise (#654)
- `worker.load` is no longer needed (`createWorker` now returns worker pre-loaded)
- `getPDF` function replaced by `pdf` recognize option (#488)
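
A minimal migration sketch for the breaking changes above. It assumes the worker still needs `loadLanguage` and `initialize` after creation, as in v3. The `Tesseract` parameter stands for the imported `tesseract.js` module, the function name `recognizeWithPdf` is hypothetical, and reading the result from `data.pdf` is an assumption about where the `pdf` option places its output.

```javascript
// Hypothetical helper: `Tesseract` is the imported tesseract.js module
// (for example, require('tesseract.js')), passed in to keep the flow explicit.
async function recognizeWithPdf(Tesseract, image) {
  // v3: const worker = Tesseract.createWorker(); await worker.load();
  // v4: createWorker is async and returns a pre-loaded worker.
  const worker = await Tesseract.createWorker();
  try {
    await worker.loadLanguage('eng');
    await worker.initialize('eng');
    // v3 called worker.getPDF() after recognize; v4 requests the PDF
    // through the `pdf` recognize option instead.
    const { data } = await worker.recognize(image, { pdf: true });
    // Assumption: the generated PDF is exposed as data.pdf.
    return { text: data.text, pdf: data.pdf };
  } finally {
    // Release the worker whether or not recognition succeeded.
    await worker.terminate();
  }
}
```

Because `createWorker` now rejects when `workerPath` or `corePath` is invalid, a caller would typically wrap `recognizeWithPdf` in `try`/`catch` to surface that failure.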
Major New Features
- Processed images created by Tesseract can be retrieved using `imageColor`, `imageGrey`, and `imageBinary` options (#588)
  - See image-processing.html example for usage
- Image rotation options `rotateAuto` and `rotateRadians` have been added, which significantly improve accuracy on certain documents
  - See Issue #648 for an example of how auto-rotation improves accuracy
  - See image-processing.html example for usage of `rotateAuto` option
- Tesseract parameters (usually set using `worker.setParameters`) can now be set for single jobs using `worker.recognize` options (#665)
  - For example, a single job can be set to recognize only numbers using `worker.recognize(image, {tessedit_char_whitelist: "0123456789"})`
  - As these settings are reverted after the job, this allows for using different parameters for specific jobs when working with schedulers
- Initialization parameters (e.g. `load_system_dawg`, `load_number_dawg`, and `load_punc_dawg`) can now be set (#613)
  - The third argument to `worker.initialize` now accepts either (1) an object with key/value pairs or (2) a string containing contents to write to a config file
  - For example, both of these lines set `load_number_dawg` to 0:
    - `worker.initialize('eng', "0", {load_number_dawg: "0"});`
    - `worker.initialize('eng', "0", "load_number_dawg 0");`
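
The per-job options and initialization parameters described above might be combined as in the sketch below. It assumes `scheduler.addJob` forwards its extra arguments to `worker.recognize` and that `rotateAuto` is passed in the same options object as Tesseract parameters. The names `toConfigString` and `recognizeMixed` are hypothetical, `Tesseract` stands for the imported `tesseract.js` module, and the `'0'` engine-mode argument is copied from the `worker.initialize` example above.

```javascript
// Hypothetical helper, not part of Tesseract.js: renders key/value pairs
// as config-file text, one "name value" pair per line.
function toConfigString(params) {
  return Object.entries(params)
    .map(([name, value]) => `${name} ${value}`)
    .join('\n');
}

// Hypothetical helper: `Tesseract` is the imported tesseract.js module.
async function recognizeMixed(Tesseract, textImage, numberImage) {
  const worker = await Tesseract.createWorker();
  await worker.loadLanguage('eng');
  // Object form of the third argument. Per the notes above, the string
  // toConfigString({ load_number_dawg: '0' }) should be accepted as well.
  await worker.initialize('eng', '0', { load_number_dawg: '0' });

  const scheduler = Tesseract.createScheduler();
  scheduler.addWorker(worker);
  try {
    const [general, digits] = await Promise.all([
      // Per-job rotation option.
      scheduler.addJob('recognize', textImage, { rotateAuto: true }),
      // Per-job Tesseract parameter, reverted once this job finishes.
      scheduler.addJob('recognize', numberImage, {
        tessedit_char_whitelist: '0123456789',
      }),
    ]);
    return { general: general.data.text, digits: digits.data.text };
  } finally {
    // Terminates the scheduler together with its workers.
    await scheduler.terminate();
  }
}
```

Since the whitelist applies only to the second job, the same worker can serve both general text and digit-only images without calling `worker.setParameters` between jobs.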
Other Changes
- `loadLanguage` now resolves without error when language is loaded but writing to cache fails
  - This allows for running in Firefox incognito mode using default settings (#609)
- `detect` returns `null` values when OS detection fails rather than throwing error (#526)
- Memory leak causing crashes fixed (#678)
- Cache corruption should now be much less common (#666)
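
Since `detect` now reports a failed orientation and script detection through `null` values instead of throwing, callers may want an explicit check. The sketch below is one way to do that. The helper name `detectOrFallback` is hypothetical, and the field names `script` and `orientation_degrees` are assumptions about the shape of the `detect` result.

```javascript
// Hypothetical helper; `worker` is an initialized Tesseract.js v4 worker.
// Assumption: the detect result exposes `script` and `orientation_degrees`.
async function detectOrFallback(worker, image) {
  const { data } = await worker.detect(image);
  if (data.script === null || data.orientation_degrees === null) {
    // Detection failed: return neutral defaults instead of propagating nulls.
    return { detected: false, script: 'unknown', degrees: 0 };
  }
  return {
    detected: true,
    script: data.script,
    degrees: data.orientation_degrees,
  };
}
```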
New Contributors
- @reda-alaoui made their first contribution in #570
Full Changelog: v3.0.3...v4.0.0