github stanfordnlp/CoreNLP v4.5.1
v4.5.1: Bugfixes

23 months ago

CoreNLP 4.5.1

Bugfixes!

  • Fix a tokenizer regression: 4.5.0 tokenized ",5" as one word 974383a
  • Use a LinkedHashMap in the PTBTokenizer instead of Properties. Keeps the option processing order predictable. #1289 6550188
  • Fix \r\n not being properly processed on Windows: #1291 9889f4e
  • Handle an unpaired half of a surrogate character pair in the tokenizer without crashing #1298 1b12faa
  • Attempt to fix semgrex "Unknown vertex" errors, which have plagued CoreNLP for years in hard-to-track-down circumstances: #1296 #1229 #1169 f99b5ab
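
Two of the fixes above rest on general Java behavior that is easy to illustrate. The sketch below is not CoreNLP code: the class TokenizerFixSketch and the helpers parseOptions and replaceLoneSurrogates are hypothetical names, and the option string is only a sample. It shows why a LinkedHashMap keeps option processing order predictable (it iterates in insertion order, which a hash-table-backed Properties does not guarantee) and one possible way to cope with half of a surrogate pair, here by substituting U+FFFD. The actual tokenizer may handle both cases differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; none of these names come from CoreNLP.
class TokenizerFixSketch {

    // Hypothetical helper: parse "key=value,key,..." into a map that
    // iterates in the order the options were written. A bare key is
    // treated as "key=true" (an assumption made for this sketch).
    static Map<String, String> parseOptions(String spec) {
        Map<String, String> options = new LinkedHashMap<>();
        for (String part : spec.split(",")) {
            String trimmed = part.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int eq = trimmed.indexOf('=');
            String key = eq < 0 ? trimmed : trimmed.substring(0, eq);
            String value = eq < 0 ? "true" : trimmed.substring(eq + 1);
            options.put(key, value);
        }
        return options;
    }

    // Hypothetical helper: keep well-formed surrogate pairs, and replace
    // any surrogate that has no partner with U+FFFD instead of failing.
    static String replaceLoneSurrogates(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            boolean pairedHigh = Character.isHighSurrogate(c)
                    && i + 1 < text.length()
                    && Character.isLowSurrogate(text.charAt(i + 1));
            if (pairedHigh) {
                // Valid pair: copy both halves unchanged.
                sb.append(c).append(text.charAt(i + 1));
                i += 2;
            } else if (Character.isSurrogate(c)) {
                // Unpaired high or low surrogate.
                sb.append((char) 0xFFFD);
                i++;
            } else {
                sb.append(c);
                i++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample option string; later options are seen after earlier ones.
        Map<String, String> options =
                parseOptions("ptb3Escaping=false,invertible,normalizeParentheses=true");
        for (Map.Entry<String, String> e : options.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
        String damaged = "a" + (char) 0xD83D + "b"; // high surrogate with no partner
        System.out.println(replaceLoneSurrogates(damaged).length());
    }
}
```

Insertion order matters whenever a broad option is meant to be applied before a narrower override, which is the situation the LinkedHashMap change addresses.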
