pypi spacy 1.2.0
v1.2.0: Alpha tokenizers for Chinese, French, Spanish, Italian and Portuguese

7 years ago

✨ Major features and improvements

  • NEW: Support Chinese tokenization, via Jieba.
  • NEW: Alpha support for French, Spanish, Italian and Portuguese tokenization.

🔴 Bug fixes

  • Fix issue #376: POS tags for "and/or" are now correct.
  • Fix issue #578: --force argument on download command now operates correctly.
  • Fix issue #595: Lemmatization corrected for some base forms.
  • Fix issue #588: Matcher now rejects empty patterns.
  • Fix issue #592: Added exception rule for tokenization of "Ph.D."
  • Fix issue #599: Empty documents are now considered tagged and parsed.
  • Fix issue #600: Add missing token.tag and token.tag_ setters.
  • Fix issue #596: Added the unicode import that was missing when compiling regexes; its absence led to incorrect tokenization.
  • Fix issue #587: Resolved bug that caused Matcher to sometimes segfault.
  • Fix issue #429: Ensure missing entity types are added to the entity recognizer.
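
To illustrate the kind of tokenizer exception rule mentioned for "Ph.D." (issue #592), here is a minimal sketch in plain Python. It is not spaCy's implementation: the EXCEPTIONS table, the PUNCT set and the tokenize function are hypothetical names chosen for this example, and the splitting logic is deliberately simplified to whitespace splitting plus trailing-punctuation peeling.

```python
# Hypothetical exception table: strings that must stay a single token.
EXCEPTIONS = {"Ph.D.", "e.g.", "i.e."}

# Characters treated as trailing punctuation in this toy tokenizer.
PUNCT = ".,;:!?"


def tokenize(text):
    """Split on whitespace, then peel trailing punctuation off each chunk,
    unless the chunk (or what remains of it) is a known exception."""
    tokens = []
    for chunk in text.split():
        suffixes = []
        # Peel one trailing punctuation character at a time, stopping early
        # if what remains is an exception such as "Ph.D.".
        while chunk and chunk not in EXCEPTIONS and chunk[-1] in PUNCT:
            suffixes.append(chunk[-1])
            chunk = chunk[:-1]
        if chunk:
            tokens.append(chunk)
        # Suffixes were collected right to left, so restore their order.
        tokens.extend(reversed(suffixes))
    return tokens


print(tokenize("She has a Ph.D. in physics."))
```

Without the "Ph.D." entry in the exception table, the same loop would split the final period off the abbreviation, which is the sort of mis-tokenization an exception rule is meant to prevent.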
