github explosion/spaCy v3.2.2
v3.2.2: Improved NER and parser speeds, bug fixes and more

2 years ago

✨ New features and improvements

  • Improved parser and NER speeds on long documents (see technical details in #10019).
  • Support for spancat components in debug data.
  • Support for ENT_IOB as a Matcher token pattern key.
  • Extended and improved types for many classes.
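
As a rough illustration of the ENT_IOB pattern key, the sketch below matches the first token of every entity. It assumes spaCy 3.2.2 or later is installed. The sentence, the entity labels and the rule name ENTITY_START are made up for the example, and the string value "B" relies on the string IOB support listed under the bug fixes (#9738).

```python
import spacy
from spacy.matcher import Matcher
from spacy.tokens import Doc, Span

# Blank English pipeline: no trained components are needed for this sketch.
nlp = spacy.blank("en")

# Build a Doc by hand and attach two illustrative entities.
doc = Doc(nlp.vocab, words=["Alice", "Smith", "visited", "New", "York"])
doc.ents = [Span(doc, 0, 2, label="PERSON"), Span(doc, 3, 5, label="GPE")]

matcher = Matcher(nlp.vocab)
# "B" marks the token that begins an entity ("I" = inside, "O" = outside).
matcher.add("ENTITY_START", [[{"ENT_IOB": "B"}]])

# Each match is (match_id, start, end); sort by start for a stable order.
matches = sorted(matcher(doc), key=lambda m: m[1])
entity_starts = [doc[start:end].text for _, start, end in matches]
print(entity_starts)
```

The same key could be combined with other token attributes, for example ENT_TYPE, to restrict a pattern to the first token of entities with a given label.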

🔴 Bug fixes

  • Fix issue #9735: Make floret murmurhash endian-neutral.
  • Fix issue #9738: Support string IOB values for ENT_IOB.
  • Fix issue #9746: Updates to avoid "dictionary size changed during iteration" runtime errors.
  • Fix issue #9960: Warn about entities that cross sentence boundaries in debug data.
  • Fix issue #9979: Fix type for Lexeme.rank.
  • Fix issue #10026: Check for 0-size assets in spacy project.
  • Fix issue #10051: Consistently return scalars from similarity methods.
  • Fix issue #10052: Fix spaces in Doc.from_docs() for empty docs.
  • Fix issue #10079: Fix label detection in debug data for components with custom names.
  • Fix issue #10109: Add types to Underscore and DependencyMatcher and improve types in Language, Matcher and PhraseMatcher.
  • Fix issue #10130: Fix Tokenizer.explain when infixes appear as prefixes.
  • Fix issue #10143: Use simple suggester in spancat initialization.
  • Fix issue #10164: Support IS_SENT_END in Doc.has_annotation.
  • Fix issue #10192: Detect invalid package names in spacy package.
  • Fix issue #10223: Support mixed case in package names.
  • Fix issue #10234: Fix type in PhraseMatcher.
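
A few of the fixes above can be checked from Python. The following is a minimal sketch, assuming spaCy 3.2.2 or later. It covers IS_SENT_END in Doc.has_annotation (#10164), Doc.from_docs() with an empty doc in the input (#10052), and Tokenizer.explain on text with leading and trailing punctuation. The example texts and variable names are illustrative only and do not come from the linked issues.

```python
import spacy
from spacy.tokens import Doc

nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")  # rule-based sentence boundaries, no model needed

# IS_SENT_END in Doc.has_annotation: True once sentence boundaries are set.
segmented = nlp("This is one sentence. This is another.")
has_sent_end = segmented.has_annotation("IS_SENT_END")

# A hand-built Doc without sentence boundaries has no such annotation.
unsegmented = Doc(nlp.vocab, words=["no", "boundaries", "here"])
missing_sent_end = unsegmented.has_annotation("IS_SENT_END")

# Doc.from_docs() with an empty doc among the inputs.
# make_doc() only tokenizes, so no pipeline annotations are involved.
parts = [nlp.make_doc("Hello world"), nlp.make_doc(""), nlp.make_doc("Goodbye")]
merged = Doc.from_docs(parts)
merged_tokens = [token.text for token in merged]

# Tokenizer.explain returns (rule, token_text) pairs; the token texts are
# expected to agree with what the tokenizer itself produces.
text = "(Let's go!)"
explained_tokens = [token_text for _, token_text in nlp.tokenizer.explain(text)]
actual_tokens = [token.text for token in nlp.make_doc(text)]

print(has_sent_end, missing_sent_end)
print(merged_tokens)
print(explained_tokens == actual_tokens)
```

Comparing Tokenizer.explain output with the real tokenization in this way is a quick sanity check when customizing prefix, suffix or infix rules.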

📖 Documentation and examples

  • Various documentation updates.
  • New spaCy version tags in spaCy universe.
  • New Dockerfile for repeatable website builds and easier local development.
  • New additions to spaCy universe:
    • Augmenty: a text augmentation library
    • Healthsea: an end-to-end spaCy pipeline for exploring health supplement effects
    • spacy-wrap: wrap fine-tuned transformers in spaCy pipelines
    • spacypdfreader: easy PDF to text to spaCy text extraction
    • textnets: text analysis with networks

👥 Contributors

@adrianeboyd, @antonpibm, @ColleterVi, @danieldk, @DuyguA, @ezorita, @HaakonME, @honnibal, @ines, @jboynyc, @KennethEnevoldsen, @ljvmiranda921, @mrshu, @pmbaumgartner, @polm, @ramonziai, @richardpaulhudson, @ryndaniels, @svlandeg, @thiippal, @thomashacker, @yoavxyoav
