✨ New features and improvements
- Make `max_length` of input text inclusive.
- Raise error when setting overlapping entities as `doc.ents`.
- Improve French lemmatization and check if a word is in one of the regular lists specific to each part-of-speech tag.
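
The first two items can be illustrated with a minimal plain-Python sketch. It is not spaCy's implementation: `check_length` and `check_no_overlap` are hypothetical helper names, and entities are assumed to be `(start, end)` token offsets with an exclusive end.

```python
def check_length(text, max_length):
    """Accept text up to and including max_length characters (hypothetical helper)."""
    # Inclusive bound: only text strictly longer than max_length is rejected.
    if len(text) > max_length:
        raise ValueError(
            "Text of length %d exceeds maximum of %d" % (len(text), max_length)
        )
    return text


def check_no_overlap(spans):
    """Raise ValueError if any two (start, end) spans overlap (hypothetical helper).

    Assumes start <= end for every span and that end is exclusive.
    """
    ordered = sorted(spans)
    # After sorting by start, any overlap shows up between neighbouring spans.
    for (start1, end1), (start2, end2) in zip(ordered, ordered[1:]):
        if start2 < end1:
            raise ValueError(
                "Overlapping spans: (%d, %d) and (%d, %d)"
                % (start1, end1, start2, end2)
            )
    return ordered
```

Under these assumptions, a text of exactly `max_length` characters passes, and adjacent spans such as `(0, 2)` and `(2, 4)` are accepted because they only touch.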
🔴 Bug fixes
- Fix issues #1581, #1969, #1986: Fix out-of-bounds access in NER training that could cause a segmentation fault.
- Fix issue #2924: Prevent problem where `displacy` arcs would receive the same IDs in Jupyter notebooks, causing weirdly positioned arc labels.
- Fix issue #2948: Fix problem with symlink creation on Windows.
📖 Documentation and examples
- Fix various typos and inconsistencies.
- Update spaCy Universe with new projects.
- Add example script showing a fix-up rule for whitespace entities like `'\n'`.
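
A fix-up rule of this kind might look roughly like the sketch below. It is not the example script itself: the function name `remove_whitespace_entities` is an assumption, and the `SimpleNamespace` objects are stand-ins for a document and its entities so that the snippet runs without a trained model.

```python
from types import SimpleNamespace


def remove_whitespace_entities(doc):
    """Drop entities whose text consists only of whitespace (assumed name)."""
    doc.ents = [ent for ent in doc.ents if not ent.text.isspace()]
    return doc


# Stand-in objects (not real spaCy classes) that mimic `doc.ents` and `ent.text`.
doc = SimpleNamespace(
    ents=[
        SimpleNamespace(text="\n", label_="GPE"),
        SimpleNamespace(text="London", label_="GPE"),
    ]
)
doc = remove_whitespace_entities(doc)
print([ent.text for ent in doc.ents])  # prints ['London']
```

In spaCy 2.x, a function with this shape could be registered as a pipeline component after the entity recognizer, for example with `nlp.add_pipe(remove_whitespace_entities, after="ner")`.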
👥 Contributors
Thanks to @digest0r, @BramVanroy, @grivaz, @wannaphongcom, @mikelibg, @danielhers, @frascuchon, @mauryaland and @cicorias for the pull requests and contributions.