pypi spacy 1.9.0
v1.9.0: Spanish model, alpha support for Norwegian & Japanese, and bug fixes

6 years ago

Thanks to all of you for 5,000 stars on GitHub, the valuable feedback in the user survey and testing spaCy v2.0 alpha. We're working hard on getting the new version ready and can't wait to release it. In the meantime, here's a new release for the 1.x branch that fixes a variety of outstanding bugs and adds capabilities for new languages.

💌 P.S.: If you haven't gotten your hands on a set of spaCy stickers yet, you can still do so – send us a DM with your address on Twitter or Gitter, and we'll mail you some!


✨ Major features and improvements

  • NEW: The first official Spanish model (377 MB) including vocab, syntax, entities and word vectors. Thanks to the amazing folks at recogn.ai for the collaboration!
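    # In a shell: download the Spanish model
    python -m spacy download es
    # In Python: load the model and process a text
    import spacy
    nlp = spacy.load('es')
    doc = nlp(u'Esto es una frase.')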
  • NEW: Alpha tokenization for Norwegian Bokmål and Japanese (via Janome).
  • NEW: Allow dropout training for Parser and EntityRecognizer, using the drop keyword argument to the update() method.
  • NEW: Glossary for POS, dependency and NER annotation scheme via spacy.explain(). For example, spacy.explain('NORP') will return "Nationalities or religious or political groups".
  • Improve language data for Dutch, French and Spanish.
  • Add Language.parse_tree method to generate POS tree for all sentences in a Doc.
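
For readers new to the drop argument mentioned above: dropout randomly zeroes a fraction of values on each training update, which tends to reduce overfitting. The sketch below shows the general technique (so-called inverted dropout) in plain Python. It is not spaCy's internal implementation, and the helper name apply_dropout is hypothetical.

```python
import random


def apply_dropout(values, drop, rng=random):
    """Zero each value with probability `drop` and rescale the survivors.

    Hypothetical helper for illustration only; not part of spaCy.
    """
    if not 0.0 <= drop < 1.0:
        raise ValueError("drop must be in the range [0, 1)")
    keep = 1.0 - drop
    # Survivors are divided by the keep probability so that the
    # expected value of each element stays the same.
    return [v / keep if rng.random() >= drop else 0.0 for v in values]


rng = random.Random(0)  # seeded so repeated runs give the same mask
activations = [1.0, 2.0, 3.0, 4.0]
dropped = apply_dropout(activations, drop=0.5, rng=rng)
unchanged = apply_dropout(activations, drop=0.0, rng=rng)
print(dropped)
print(unchanged)
```

With drop=0.0 nothing is removed and the values pass through unchanged. Higher rates remove more values per update.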

🔴 Bug fixes

  • Fix issue #1031: Close gaps in Lexeme API.
  • Fix issue #1034: Add annotation scheme glossary and spacy.explain().
  • Fix issue #1051: Improve error messaging when trying to load a non-existent model.
  • Fix issue #1052: Add missing SP symbol to tag map.
  • Fix issue #1061: Add flush_cache method to tokenizer.
  • Fix issue #1069: Fix Doc.sents iterator when customised with generator.
  • Fix issues #1099, #1143: Improve documentation on models in requirements.txt.
  • Fix issue #1137: Use lower min version for requests dependency.
  • Fix issue #1207: Fix Span.noun_chunks.
  • Fix issue with six and its dependencies that occasionally caused spaCy to fail.
  • Fix typo in package command that caused error when printing error messages.
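
The annotation scheme glossary added for #1034 can be pictured as a simple label-to-description lookup. The sketch below is a stand-alone illustration in plain Python, not spaCy's own code. The GLOSSARY table is abbreviated and hypothetical apart from the NORP entry quoted in these notes, and returning None for unknown labels is an assumption of this sketch.

```python
# Abbreviated, hypothetical glossary; the real table covers the full
# POS, dependency and NER annotation schemes.
GLOSSARY = {
    'NORP': 'Nationalities or religious or political groups',
    'ADJ': 'adjective',
    'nsubj': 'nominal subject',
}


def explain(term):
    """Return the description for `term`, or None if it is not listed."""
    return GLOSSARY.get(term)


description = explain('NORP')
missing = explain('NOT_A_LABEL')
print(description)
print(missing)
```

In spaCy itself the equivalent call is spacy.explain('NORP'), as shown in the feature list above.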

📖 Documentation and examples

  • Fix various typos and inconsistencies.
  • NEW: spaCy 101 guide for v2.0: all important concepts, explained with examples and illustrations. Note that some of the behaviour and examples are specific to v2.0+ – but the NLP basics are relevant independent of the spaCy version you're using.

👥 Contributors

Thanks to @kengz, @luvogels, @ferdous-al-imran, @uetchy, @akYoung, @pasupulaphani, @dvsrepo, @raphael0202, @yuvalpinter, @frascuchon, @kootenpv, @oroszgy, @bartbroere, @ianmobbs, @garfieldnate, @polm, @callumkift, @swierh, @val314159, @lgenerknol and @jsparedes for the contributions!
