Since v1.0.0b19:
Added:
* Workspace validation: Validate that files mentioned in pc:Page/@imageFilename exist in METS and on FS, #309
Fixed:
* ocrd ocrd-tool parse-params has the same string-or-filepath logic for -p/--parameter as the CLI
Since v0.15.2:
Added:
- Spec-conformant handling of AlternativeImage, OCR-D/spec#116, OCR-D/ocrd_tesserocr#33, #284
- `ocrd workspace list-page` to list all page IDs
- `ocrd workspace remove` to remove files, #275, #245
- `ocrd workspace remove-group` to remove file groups, #275, #245
- `ocrd workspace prune-files`
- Workspace validation: Validate that files mentioned in pc:Page/@imageFilename exist in METS and on FS, #309
- utils: `MIME_TO_EXT` to map mime types to preferred extension
- Validation of imageHeight/imageWidth in PAGE vs. actual image height/width, #229
- image_from_page: allow filtering by feature (@comment), #294
- `points_from_y0x0y1x1` for inverted x/y pairs
- many utility methods for image manipulation and coordinate handling, #268, OCR-D/ocrd_tesserocr#49
  - `bbox_from_points`
  - `bbox_from_xywh`
  - `bbox_from_polygon`
  - `coordinates_for_segment`
  - `coordinates_of_segment`
  - `crop_image`
  - `membername`
  - `image_from_polygon`
  - `points_from_bbox`
  - `points_from_polygon`
  - `points_from_xywh`
  - `polygon_from_bbox`
  - `polygon_from_x0y0x1y1`
  - `polygon_from_xywh`
  - `polygon_mask`
  - `rotate_coordinates`
  - `xywh_from_bbox`
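
The coordinate helpers listed above convert between three representations: an x/y/width/height box, a min/max bounding box, and the space-separated `x,y` points string used in PAGE `Coords`. The following is a minimal standalone sketch of that idea. It borrows three names from the list for readability, but the signatures, return types and the `_parse_points` helper are assumptions made for illustration, not the library's code.

```python
# Illustrative re-implementations only; signatures and return types are assumed.

def _parse_points(points):
    """Hypothetical helper: parse 'x1,y1 x2,y2 ...' into [x, y] integer pairs."""
    return [[int(v) for v in pair.split(',')] for pair in points.split(' ')]

def points_from_xywh(box):
    """Turn an {'x','y','w','h'} dict into a points string, clockwise from top-left."""
    x, y, w, h = box['x'], box['y'], box['w'], box['h']
    return '%i,%i %i,%i %i,%i %i,%i' % (x, y, x + w, y, x + w, y + h, x, y + h)

def bbox_from_points(points):
    """Return (min_x, min_y, max_x, max_y) enclosing all points in the string."""
    polygon = _parse_points(points)
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    return min(xs), min(ys), max(xs), max(ys)

def xywh_from_bbox(min_x, min_y, max_x, max_y):
    """Convert a min/max bounding box back into an {'x','y','w','h'} dict."""
    return {'x': min_x, 'y': min_y, 'w': max_x - min_x, 'h': max_y - min_y}

# Round trip: box -> points string -> bounding box -> box
box = {'x': 10, 'y': 20, 'w': 100, 'h': 50}
points = points_from_xywh(box)
round_trip = xywh_from_bbox(*bbox_from_points(points))
```

For an axis-aligned rectangle the round trip returns the original box; for an arbitrary polygon the bounding box is only the enclosing rectangle.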
Fixed:
- Handle TIFF ResolutionUnit not being set #250
- bashlib: `--mets-file` should be `--mets`
- `ocrd workspace set-id` case in argument error
- fix DeprecationWarning for PyYAML 5.1+
- use headless opencv
- Regression with ocrd_page data types, #269
- Segfault issue with Pillow >= 6.0.0, #270
- `ocrd ocrd-tool parse-params` has the same string-or-filepath logic for -p/--parameter as the CLI
- Workspace: Simplify file download code, add extensions to files
- Processor: `chdir` to workspace directory on init so relative files resolve properly
- typos in docstrings
- README: 'module' -> 'package'
- workspace.image_from_page: logic with rotation/angle
- Adapted test suite to OCR-D/assets now with file extensions
- Require `Pillow == 5.4.1` throughout
- regression in namespace handling of PAGE output, #277
- METS is serialized as Unicode instead of character entities, #279
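
The "string-or-filepath logic" for -p/--parameter means that the argument may be either a path to a JSON file or a raw JSON string. A minimal sketch of one way to implement this follows. The helper name `load_parameter`, the order of the checks (file first, then raw JSON) and the `dpi` parameter are assumptions made for illustration, not the project's actual code.

```python
import json
import os
import tempfile

def load_parameter(value):
    """Hypothetical helper: read JSON from a file if `value` names one,
    otherwise parse `value` itself as raw JSON."""
    if os.path.isfile(value):
        with open(value, 'r', encoding='utf-8') as f:
            return json.load(f)
    # Not an existing file: treat the argument as a JSON string.
    return json.loads(value)

# Raw JSON passed directly on the command line
raw = load_parameter('{"dpi": 300}')

# The same parameters stored in a file and passed by path
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'params.json')
    with open(path, 'w', encoding='utf-8') as f:
        json.dump({"dpi": 300}, f)
    from_file = load_parameter(path)
```

Both calls yield the same dictionary, so downstream code does not need to know which form the user chose.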
Changed:
- 🔥 Drop Python2 support
- 🔥 Refactored project into 5 modules with few dependencies each
- Implement 3.2.0 of the spec
- OcrdFile: Default fileGrp to `TEMP`
- OcrdFile: Accept url constructor arg
- Extended page with TextStyle for Page, , PRImA-Research-Lab/PAGE-XML#8
- `-m/--mets` is not required anymore, #301
- `ocrd workspace prune-files`: Throw on error removing non-existent file
- `-p/--parameter` argument accepts raw JSON as well now, #239
- workspace bagger will create files with extension
- export additional region types from generated code, #241
- `save_mets` is atomic now, #278, #285
- missing required parameters should raise exception, fix #244 #247
- Improve pixel density logic in OcrdExif, #256, #37, OCR-D/ocrd_tesserocr#54
- 🔥 stop supporting python `<= 3.4`
- Support only 2019-07-15 PAGE version
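
The changelog does not say how the atomic METS save is achieved. One common technique is to write to a temporary file in the same directory and then rename it over the target, sketched below. The function `atomic_write` and the file contents are hypothetical and only illustrate the general pattern.

```python
import os
import tempfile

def atomic_write(path, data):
    """Write bytes so that readers see either the old or the new file,
    never a partially written one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix='.tmp-')
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # os.replace swaps the new file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, 'mets.xml')
    atomic_write(target, b'<mets/>')
    atomic_write(target, b'<mets version="2"/>')
    with open(target, 'rb') as f:
        content = f.read()
    leftovers = [n for n in os.listdir(workdir) if n != 'mets.xml']
```

If the process dies mid-write, only the temporary file is affected and the previous `mets.xml` stays intact.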
Removed:
- 🔥 Move factory methods from OcrdPage and OcrdExif to new module `ocrd.model_factory`
- Factor out XML constants to `ocrd.constants.xml`
- 🔥 BaseProcessor.add_output_file removed