✨ Major features and improvements
- Add `Span.sentiment` attribute.
- #658: Add `Span.noun_chunks` iterator (thanks @pokey).
- #642: Let `--data-path` be specified when running download.py scripts (thanks @ExplodingCabbage).
- #638: Add German stopwords (thanks @souravsingh).
- #614: Fix `PhraseMatcher` to work with new `Matcher` (thanks @sadovnychyi).
🔴 Bug fixes
- Fix issue #605: `accept` argument to `Matcher` now rejects matches as expected.
- Fix issue #617: `Vocab.load()` now works with string paths, as well as `Path` objects.
- Fix issue #639: Stop words in `Language` class now used as expected.
- Fix issues #656, #624: `Tokenizer` special-case rules now support arbitrary token attributes.
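
The fix for issue #617 means callers can pass either a string or a `Path`. The snippet below is a minimal sketch of that general pattern using only the standard library. The helper name `ensure_path` and the example path `data/vocab` are hypothetical, chosen for illustration; this is not necessarily how `Vocab.load()` handles its argument internally.

```python
from pathlib import Path


def ensure_path(path):
    """Return a Path for str input; pass Path objects through unchanged.

    Hypothetical helper for illustration only; not taken from spaCy's source.
    """
    if isinstance(path, str):
        return Path(path)
    return path


# Both call styles produce an equivalent Path object.
from_string = ensure_path("data/vocab")
from_path = ensure_path(Path("data") / "vocab")
print(from_string == from_path)
```

Normalizing once at the entry point keeps the rest of a loader working with a single type, so later code can rely on `Path` methods without further type checks.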
📖 Documentation and examples
- Add "Customizing the tokenizer" workflow.
- Add "Training the tagger, parser and entity recognizer" workflow.
- Add "Entity recognition" workflow.
- Fix various typos and inconsistencies.
👥 Contributors
Thanks to @pokey, @ExplodingCabbage, @souravsingh, @sadovnychyi, @manojsakhwar, @TiagoMRodrigues, @savkov, @pspiegelhalter, @chenb67, @kylepjohnson, @YanhaoYang, @tjrileywisc, @dechov, @wjt, @jsmootiv and @blarghmatey for the pull requests!