pypi pdfplumber 0.9.0
v0.9.0

17 months ago

Changed

  • Make word segmentation (via WordExtractor.char_begins_new_word(...)) more explicit and rigorous, which should help catch edge cases in the future. (6acd580 + ebb93ea + #840)
  • Use curve_edge objects (instead of just line and rect_edge objects) in default table-detection strategy. (6f6b465 + #858)
  • By default, expand ligatures into their constituent letters (e.g., ﬃ to ffi), and add the expand_ligatures boolean parameter to text-extraction methods. (86e935d + #598)
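
To illustrate the ligature change, here is a minimal sketch. The LIGATURES table and the expand_ligatures helper function are hypothetical names written for this example and are not pdfplumber's internal implementation. The only library call assumed is the expand_ligatures keyword on Page.extract_text(...), which this release describes.

```python
# Illustrative table of Unicode Latin ligatures and their letters.
# Assumption: this is for demonstration only, not pdfplumber's own table.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}


def expand_ligatures(text: str) -> str:
    """Replace each ligature character with its constituent letters."""
    return "".join(LIGATURES.get(ch, ch) for ch in text)


def extract_with_and_without(path: str, page_index: int = 0):
    """Return (expanded, raw) text for one page of a PDF.

    pdfplumber is imported lazily so the helper above can be used
    without the library installed.
    """
    import pdfplumber  # third-party dependency

    with pdfplumber.open(path) as pdf:
        page = pdf.pages[page_index]
        expanded = page.extract_text()  # ligatures expanded by default
        raw = page.extract_text(expand_ligatures=False)  # keep ligatures
    return expanded, raw
```

Passing expand_ligatures=False should restore the pre-0.9.0 behavior for code that depends on the original ligature characters.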

Added

  • Add Page.extract_text_lines(...) method. (4b37397 + #852)
  • Add main_group, return_groups, return_chars parameters to Page.search(...). (4b37397)
  • Add .curve_edges property to PDF and Page. (6f6b465)
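
The additions above might be used together roughly as follows. This is a sketch under assumptions: the date-like pattern, the inspect_page and summarize_matches helpers, and the dictionary keys read from each result ("text", "groups", "x0", "top", "x1", "bottom") are choices made for this example and should be checked against the installed version.

```python
def summarize_matches(matches):
    """Reduce search results to (text, groups, rounded bounding box)."""
    return [
        (
            m["text"],
            m.get("groups", ()),
            (round(m["x0"]), round(m["top"]), round(m["x1"]), round(m["bottom"])),
        )
        for m in matches
    ]


def inspect_page(path, pattern=r"(\d{4})-(\d{2})", page_index=0):
    """Collect text lines, regex matches, and the curve-edge count for a page."""
    import pdfplumber  # third-party dependency, imported lazily

    with pdfplumber.open(path) as pdf:
        page = pdf.pages[page_index]
        # One dict per line of text, each with its text and position.
        lines = page.extract_text_lines()
        # main_group=1 asks for the first capture group as the main match;
        # return_chars=False omits the per-character objects.
        matches = page.search(
            pattern, main_group=1, return_groups=True, return_chars=False
        )
        # Edges derived from curve objects, now used in table detection.
        n_curve_edges = len(page.curve_edges)
    return [line["text"] for line in lines], summarize_matches(matches), n_curve_edges
```

Setting return_chars=False keeps the results small when only the matched text and its position are needed.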

Fixed

  • Fix handling of bytes-typed fontnames. (9441ff7 + #461 + #842)
  • Fix handling of whitespace-only and empty results of Page.search(...). (6f6b465 + #853)
