✨ New features and improvements
- Allow sourcing disabled components in config.
- Support `Doc.spans` in `Example.from_dict`.
- Improve transformer recommendations in quickstart widget and `init config`.
- Improve language data for Bulgarian.
- Various improvements to error handling and UX.
🔴 Bug fixes
- Fix issue #6952, #7285, #7289: Make `tok2vec` pretraining and `pretrain` command work as expected again.
- Fix issue #7062: Only evaluate named entities for NEL if there is a corresponding gold span.
- Fix issue #7065: Correctly handle sentence boundaries in `Span.sent`.
- Fix issue #7071: Fix `conll` converter option.
- Fix issue #7100: Re-add `n_sents` to entity linker and fix config handling and I/O.
- Fix issue #7122: Fix displaCy output in `evaluate` CLI.
- Fix issue #7127: Fix initialization of `UkrainianLemmatizer`.
- Fix issue #7176: Re-refactor `Sentencizer` to use `Pipe` API.
- Fix issue #7182: Allow `SpanGroup` import from `spacy.tokens`.
- Fix issue #7204: Adjust Cython compilation for setups with custom include paths.
- Fix issue #7222: Correct YAML formatting in quickstart recommendations for `bg` and `bn`.
- Fix issue #7225: Fix `spans` weakref in `Doc.copy`.
- Fix issue #7237: Fix `is_cython_func` for additional imported code.
- Fix issue #7250: Fix patience for identical scores.
- Fix issue #7329: Make `spacy.orth_variants.v1` and `spacy.lower_case.v1` augmenters work as expected.
- Fix issue #7352: Sort `EntityRuler.labels` alphabetically.
📖 Documentation and examples
- Add documentation for `textcat_multilabel` component.
- Extend documentation for `Vocab.get_noun_chunks`.
- Fix various typos and inconsistencies.
👥 Contributors
Thanks to @MartinoMensio, @SergeyShk, @R1j1t, @palandlom, @dardoria, @tocic, @clippered, @graue70, @koaning and @jankrepl for the pull requests and contributions!