github explosion/spaCy v3.3.1
New Span Ruler component, JSON (de)serialization of Doc, span analyzer and more

2 years ago

✨ New features and improvements

🔴 Bug fixes

  • Fix issue #9575: Fix Entity Linker with tokenization mismatches between gold and predicted Doc objects.
  • Fix issue #10685: Fix serialization of SpanGroup objects that share the same name within one SpanGroups container.
  • Fix issue #10718: Remove debug print statements in walk_head_nodes to avoid acquiring the GIL.
  • Fix issue #10741: Make the StringStore.__getitem__ return type dependent on its parameter type.
  • Fix issue #10734: Support removal of overlapping terms in PhraseMatcher.
  • Fix issue #10772: Override SpanGroups.setdefault to also support Iterable[Span] as the default.
  • Fix issue #10817: Ensure that the term ROOT is in the glossary.
  • Fix issue #10830: Better errors for Doc.has_annotation and Matcher.
  • Fix issue #10864: Avoid pickling Doc inputs passed to Language.pipe().
  • Fix issue #10898: Fix schemas import in Doc.
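
Two of the fixes above touch APIs that are easy to exercise directly. The following is a minimal sketch, not taken from the release itself: it assumes that "overlapping terms" in the PhraseMatcher fix refers to one term being a prefix of another, and the rule names SHORT and LONG and the sample sentence are made up for illustration.

```python
import spacy
from spacy.matcher import PhraseMatcher
from spacy.strings import StringStore

# StringStore lookups work in both directions, so the return type
# depends on the argument type: str -> int hash, int hash -> str.
store = StringStore(["ROOT"])
root_hash = store["ROOT"]      # an int hash
root_text = store[root_hash]   # the original string

# Hypothetical overlapping terms: "New York" is a prefix of "New York City".
nlp = spacy.blank("en")
matcher = PhraseMatcher(nlp.vocab)
matcher.add("SHORT", [nlp.make_doc("New York")])
matcher.add("LONG", [nlp.make_doc("New York City")])

# Remove only the shorter rule; the longer one should keep matching.
matcher.remove("SHORT")

doc = nlp("She moved to New York City.")
remaining = [
    (nlp.vocab.strings[match_id], doc[start:end].text)
    for match_id, start, end in matcher(doc)
]
print(remaining)
```

With a spaCy version that includes the fix, only the LONG rule should still produce a match after the removal.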

⚠️ Backward incompatibilities

  • Before this release, a validation bug allowed a pipeline component's configuration to override the name under which that component is registered in the pipeline, through its name attribute. For example, the following pipeline component:

    [components.transformer]
    factory = "transformer"
    name = "custom_transformer_name"

    would be registered erroneously as custom_transformer_name. Such overrides are now ignored and a warning is emitted (#10779). From spaCy v3.3.1 onwards, this component will be registered as transformer.
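
The new behavior can be checked with a small script. This is a sketch under assumptions: it uses the built-in sentencizer factory as a stand-in for transformer (which needs the separate spacy-transformers package), and the value custom_sentencizer_name is made up. It also assumes spaCy v3.3.1 or later is installed.

```python
import warnings

from spacy.util import load_model_from_config
from thinc.api import Config

# Stand-in for the transformer example: the section key is "sentencizer",
# and the config tries to override it through the name attribute.
CONFIG = """
[nlp]
lang = "en"
pipeline = ["sentencizer"]

[components]

[components.sentencizer]
factory = "sentencizer"
name = "custom_sentencizer_name"
"""

# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    nlp = load_model_from_config(Config().from_str(CONFIG), auto_fill=True)

# The component should be registered under its section key, not the override.
print(nlp.pipe_names)
for warning in caught:
    print(warning.message)
```

If the override is ignored as described, nlp.pipe_names contains only the section key, and the recorded warnings should include one about the ignored name.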

👥 Contributors

@adrianeboyd, @danieldk, @freddyheppell, @honnibal, @ines, @kadarakos, @ldorigo, @ljvmiranda921, @maxTarlov, @pmbaumgartner, @polm, @pypae, @richardpaulhudson, @rmitsch, @shadeMe, @single-fingal, @svlandeg
