github stanfordnlp/CoreNLP v4.5.6
v4.5.6: Lemmatizer & Tokenizer bugfixes

English Lemmatizer upgrades

  • enroll, appall as American spellings, instead of enrol & appal; de- as a verb prefix; blog and xfer as double-letter exceptions 8adcbfe
  • cowritten 2dd08da
  • elder / eldest 9b5bec8
  • Yazidi as a demonym 2852da8

Tokenizer upgrades

  • #number as a single thing after an abbreviation #1396 ad37f2a

UD Processing upgrades

  • 'twas and 'tis as MWT in the UD converter b9f19a6
  • Sort morpho features in alphabetical order when writing out UD f77a9b4
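
For illustration only, here is a minimal sketch of what sorting morphological features alphabetically can look like for a CoNLL-U FEATS string. It is not the code from the commit above: the class name UdFeatureSortSketch and the helpers featureName and sortFeats are hypothetical, and the sketch assumes features are "Name=Value" pairs joined by "|", with "_" meaning no features, compared by name without regard to case.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of CoreNLP: sorts a CoNLL-U FEATS string by feature name.
final class UdFeatureSortSketch {

    /** Returns the part of a "Name=Value" pair before '=', or the whole string if there is no '='. */
    static String featureName(String pair) {
        int eq = pair.indexOf('=');
        return eq < 0 ? pair : pair.substring(0, eq);
    }

    /** Sorts "Name=Value" pairs by name, ignoring case; "_" (no features) is returned unchanged. */
    static String sortFeats(String feats) {
        if (feats == null || feats.isEmpty() || feats.equals("_")) {
            return "_";
        }
        String[] pairs = feats.split("\\|");
        Comparator<String> byName =
            Comparator.comparing(UdFeatureSortSketch::featureName, String.CASE_INSENSITIVE_ORDER);
        Arrays.sort(pairs, byName);
        return String.join("|", pairs);
    }

    public static void main(String[] args) {
        // prints "Mood=Ind|Tense=Past|VerbForm=Fin"
        System.out.println(sortFeats("VerbForm=Fin|Mood=Ind|Tense=Past"));
    }
}
```

Comparing only the feature name keeps layered features such as Number[psor] next to their base feature, with the shorter name first.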

Other Bugfixes

  • Crash when deleting the endpoints of an IntervalTree #1405 6d17c23
  • Find and remove extraneous uses of yield, which became a keyword: e5c9d44 b084233
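
As background on the yield item, and not taken from the commits themselves: since Java 14, yield is a contextual keyword that supplies the value of a block-bodied case in a switch expression, which is presumably why older code using the name needed cleanup. The sketch below only shows the language feature; the class YieldKeywordSketch, the method describe, and the tag-to-label mapping are hypothetical.

```java
// Hypothetical example, not CoreNLP code: shows `yield` inside a switch expression (Java 14+).
final class YieldKeywordSketch {

    /** Maps a few Penn Treebank tags to a coarse label. */
    static String describe(String tag) {
        return switch (tag) {
            case "NN", "NNS" -> "noun";
            case "VB", "VBD" -> {
                // In a block-bodied case, `yield` provides the value of the switch expression.
                String label = "verb";
                yield label;
            }
            default -> "other";
        };
    }

    public static void main(String[] args) {
        // prints "verb"
        System.out.println(describe("VBD"));
    }
}
```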

Minor API changes

  • Updating the text on a CoreLabel no longer wipes out the Lemma c03522b
  • Update to more recent Jakarta Servlet 8a671fd

Ssurgeon

  • UpdateMorphoFeatures edit 27c6703
  • Lemmatize operation (only works on English) c26b25e
