explosion/spaCy v2.3.3 on GitHub

✨ New features and improvements

NEW: Add alpha support for Macedonian and Sanskrit.
Update language data for Croatian, Czech, English, Hebrew, Hindi, Indonesian, Swedish, Thai and Turkish.
Add support for aarch64 and ppc64le on Linux, with binary packages available on conda-forge.

🔴 Bug fixes

Fix issue #5610: Make sure sys.argv exists.
Fix issue #5643: Add ent_id_ to strings serialized with Doc.
Fix issue #5727: Clarify warning for misaligned BILUO tags.
Fix issue #5768: Improve tag map initialization and updating.
Fix issue #5794: Improve warnings around normalization tables.
Fix issue #5796: Update invalid tag maps.
Fix issue #5799: Remove hard-coded GPU ID from pretrain.
Fix issue #5802: Mark Japanese documents as tagged.
Fix issue #5823: Fix typo in unit tests.
Fix issue #5838: Fix EntityRenderer to support line breaks (after the last entity).
Fix issue #5843: Prefer earlier spans in EntityRuler.
Fix issue #5849: Allow Doc.char_span to snap to token boundaries.
Fix issue #5853: Fix span boundary handling in Spanish noun chunks.
Fix issue #5861: Add Span index boundary checks.
Fix issue #5904: Fix typos in comments.
Fix issue #5910: Update default sentencizer characters for Armenian, Greek and Arabic.
Fix issue #6014: Fix off-by-one error for best iteration calculation.
Fix issue #6112: Fix overlapping German noun chunks.
Fix issue #6148: Identify final Matcher pattern node by quantifier.
Fix issue #6164: Reorder so tag map is replaced only if a custom file is provided.
Fix issue #6218: Reproducibility for TextCategorizer and Tok2Vec.
Fix issue #6219: Add re-enabled pipe names back to the meta before serializing.
Fix issue #6300: Fix on_match callback and exclude empty match lists from results for DependencyMatcher.
Fix issue #6347: Memory leak issues with beam_parse (requires thinc>=7.4.3).
Fix issue #6373: Fix textcat reproducibility on GPU (requires thinc>=7.4.3).
Fix issue #6405: Add all vectors to vocab before pruning.
Fix issue #6413: Use int8_t instead of char in Matcher.
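
To illustrate what "snapping to token boundaries" means for the Doc.char_span fix (#5849), here is a minimal pure-Python sketch. It does not use spaCy and is not spaCy's implementation: the helper names (`whitespace_offsets`, `snap_char_span`) and the mode names (`"strict"`, `"inside"`, `"outside"`) are assumptions chosen for this example, and tokens are approximated by whitespace splitting.

```python
from typing import List, Optional, Tuple


def whitespace_offsets(text: str) -> List[Tuple[int, int]]:
    """Return (start_char, end_char) for each whitespace-separated token."""
    offsets = []
    pos = 0
    for word in text.split():
        begin = text.index(word, pos)
        offsets.append((begin, begin + len(word)))
        pos = begin + len(word)
    return offsets


def snap_char_span(
    token_offsets: List[Tuple[int, int]],
    start: int,
    end: int,
    mode: str = "strict",
) -> Optional[Tuple[int, int]]:
    """Map a character range to (first_token, last_token_exclusive).

    Hypothetical modes for this sketch:
      "strict"  - both ends must already sit on token boundaries
      "inside"  - keep only tokens fully contained in the range
      "outside" - keep every token that overlaps the range
    Returns None if no token span can be produced.
    """
    if mode == "strict":
        starts = {s: i for i, (s, _) in enumerate(token_offsets)}
        ends = {e: i for i, (_, e) in enumerate(token_offsets)}
        if start in starts and end in ends and starts[start] <= ends[end]:
            return starts[start], ends[end] + 1
        return None
    if mode == "inside":
        covered = [i for i, (s, e) in enumerate(token_offsets)
                   if s >= start and e <= end]
    elif mode == "outside":
        covered = [i for i, (s, e) in enumerate(token_offsets)
                   if s < end and e > start]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    if not covered:
        return None
    return covered[0], covered[-1] + 1


text = "New York City is big"
offsets = whitespace_offsets(text)

# Characters 2..10 cover "w York C", which is misaligned with the tokens.
print(snap_char_span(offsets, 2, 10, "strict"))   # None
print(snap_char_span(offsets, 2, 10, "inside"))   # (1, 2): "York"
print(snap_char_span(offsets, 2, 10, "outside"))  # (0, 3): "New York City"
```

A misaligned range yields no span in the strict mode, while the two snapping modes shrink or grow the range to the nearest token boundaries. For the actual parameter name and accepted values in this release, check the Doc.char_span API documentation.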

👥 Contributors

Thanks to @abchapman93, @baranitharan2020, @bittlingmayer, @bjascob, @borijang, @BramVanroy, @chopeen, @danielvasic, @delzac, @DuyguA, @erip, @florijanstamenkovic, @graue70, @hiroshi-matsuda-rit, @holubvl3, @idoshr, @jgutix, @KKsharma99, @leyendecker, @lizhe2004, @MartinoMensio, @nipunsadvilkar, @Nuccy90, @oculusrepairo, @rahul1990gupta, @rasyidf, @robertsipek, @SamEdwardes, @snsten, @solarmist, @Stannislav, @tamuhey, @tilusnet, @vha14, @wannaphong, @zaibacu for the pull requests and contributions.
