CoreNLP 4.5.1
Bugfixes!
- Fix tokenizer regression: 4.5.0 tokenized ",5" as one word 974383a
- Use a `LinkedHashMap` in the PTBTokenizer instead of `Properties`. Keeps the option processing order predictable. #1289 6550188
- Fix `\r\n` not being properly processed on Windows: #1291 9889f4e
- Handle one half of surrogate character pairs in the tokenizer w/o crashing #1298 1b12faa
- Attempt to fix semgrex "Unknown vertex" errors which have plagued CoreNLP for years in hard to track down circumstances: #1296 #1229 #1169 f99b5ab
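
To illustrate why the `LinkedHashMap` change makes option processing predictable: `java.util.Properties` is backed by a hash table, so iterating it does not follow insertion order, while `LinkedHashMap` iterates in the order keys were first inserted. The sketch below is not CoreNLP's actual code. The class `OptionOrderSketch`, the helper `parseOptions`, and the sample option string are hypothetical and only show the idea.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

public class OptionOrderSketch {

    // Hypothetical helper (not part of CoreNLP): parses a comma-separated
    // option string such as "a,b=false" into a map that remembers the order
    // in which the options were written.
    public static Map<String, String> parseOptions(String spec) {
        Map<String, String> options = new LinkedHashMap<>();
        for (String part : spec.split(",")) {
            String trimmed = part.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty entries such as a trailing comma
            }
            int eq = trimmed.indexOf('=');
            // A bare option name is treated as "name=true" in this sketch.
            String key = eq < 0 ? trimmed : trimmed.substring(0, eq);
            String value = eq < 0 ? "true" : trimmed.substring(eq + 1);
            options.put(key, value);
        }
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> options =
            parseOptions("invertible,ptb3Escaping=false,splitHyphenated=true");
        // Iteration follows the order the options were written in.
        System.out.println(new ArrayList<>(options.keySet()));
        // prints [invertible, ptb3Escaping, splitHyphenated]
    }
}
```

With an insertion-ordered map, an option that adjusts several settings at once and a later option that overrides one of them are always applied in the order the user wrote them.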