pypi spacy 2.0.3
v2.0.3: Improvements to tokenizer caching and serialization, plus various bug fixes

6 years ago

✨ New features and improvements

  • Require Thinc v6.10.1 to fix GPU installation and beam parsing.
  • Improve Turkish stop words.
  • Improve Hindi stop words.

🔴 Bug fixes

  • Fix issue #1248: Update English tokenizer and norm exceptions for "-in" and "-in'" verbs.
  • Fix issue #1506: Fix KeyError from cleaning up strings during Language.pipe (work in progress).
  • Fix issue #1521: Ensure path in Doc.to_disk and Doc.from_disk.
  • Fix issue #1525, #1582: Update fastText example to accommodate whitespace.
  • Fix issue #1541: Remove broken link from documentation.
  • Fix issue #1546: Add missing import to make util.minibatch work correctly.
  • Fix issue #1557: Add dummy serialization methods to Japanese tokenizer to allow saving and loading models.
  • Fix caching in Tokenizer (partially addresses performance regression in #1371 and #1508).
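
For illustration, assuming the fix for #1521 concerns accepting a path given as a plain string as well as a Path object, the following is a minimal standard-library sketch of that kind of path coercion. The helpers `ensure_path`, `save_bytes` and `load_bytes` are hypothetical names used only for this example; this is not spaCy's actual implementation of Doc.to_disk or Doc.from_disk.

```python
from pathlib import Path
import os
import tempfile


def ensure_path(path):
    """Return a Path for string input; pass any other object through unchanged."""
    if isinstance(path, str):
        return Path(path)
    return path


def save_bytes(path, data):
    """Write bytes to `path`, which may be a str or a Path (hypothetical helper)."""
    path = ensure_path(path)
    with path.open("wb") as file_:
        file_.write(data)


def load_bytes(path):
    """Read bytes from `path`, which may be a str or a Path (hypothetical helper)."""
    path = ensure_path(path)
    with path.open("rb") as file_:
        return file_.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        target = os.path.join(tmp_dir, "doc.bin")  # a plain string path
        save_bytes(target, b"serialized doc")      # works with a str
        print(load_bytes(Path(target)))            # and with a Path
```

Coercing once at the top of each function means the rest of the body can rely on Path methods such as `open` regardless of what the caller passed in.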

📖 Documentation and examples

👥 Contributors

Thanks to @MathiasDesch, @mcsalgado, @Wahib, @ligser, @abhi18av, @DuyguA, @KMLDS and @yogendrasoni for the pull requests and contributions.
