Ssurgeon updates beyond the capabilities listed in the GURT paper
- MergeNodes operation: combine two words into one word in a graph. One word must be a leaf headed by the other for this to work 0660fa9
- CombineMWT operation: mark MWT on two or more words. Stanza will treat these as `Token` 010a955
- DeleteLeaf operation: remove a leaf, renumber the subsequent words 429f61a
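
To illustrate what "remove a leaf, renumber the subsequent words" means, here is a minimal sketch in plain Java. It does not use the Ssurgeon API: the `Word` class, the `deleteLeaf` helper, and the representation of a graph as a list of words with 1-based head indices are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class DeleteLeafSketch {
    /** Hypothetical word: 1-based index, text, and 1-based head index (0 = root). */
    static final class Word {
        final int index;
        final String text;
        final int head;

        Word(int index, String text, int head) {
            this.index = index;
            this.text = text;
            this.head = head;
        }

        @Override
        public String toString() {
            return index + ":" + text + "<-" + head;
        }
    }

    /** Removes the word at position target if it is a leaf, then renumbers later words. */
    static List<Word> deleteLeaf(List<Word> words, int target) {
        // A leaf has no dependents, so no other word may point at it as its head.
        for (Word w : words) {
            if (w.head == target) {
                throw new IllegalArgumentException("word " + target + " is not a leaf");
            }
        }
        List<Word> result = new ArrayList<>();
        for (Word w : words) {
            if (w.index == target) {
                continue; // drop the leaf itself
            }
            // Every index after the deleted word shifts down by one, including head pointers.
            int newIndex = w.index > target ? w.index - 1 : w.index;
            int newHead = w.head > target ? w.head - 1 : w.head;
            result.add(new Word(newIndex, w.text, newHead));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Word> sentence = new ArrayList<>();
        sentence.add(new Word(1, "the", 3));
        sentence.add(new Word(2, "big", 3));
        sentence.add(new Word(3, "dog", 4));
        sentence.add(new Word(4, "barked", 0));
        System.out.println(deleteLeaf(sentence, 2));
    }
}
```

In this sketch the head pointers are shifted along with the word indices; whether the real operation needs an equivalent step depends on how the graph stores its edges, which the notes above do not specify.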
Bugfixes
- fix graph serialization for sentences longer than 128 words (`IdentityHashSet` doesn't work for integers beyond 128) d8d9d9f
- fix `valueOf` for `SemanticGraph` if a word is just a dash 203eb06
- fix memory usage of evaluating a PCFG model, which would run out of memory because it was saving all of the charts while evaluating b2e67b0
- Tregex pattern would not correctly display when using optional patterns: a9965b2 8659653
- Tregex would infinite loop on certain optional patterns which were theoretically legal cc7983e
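
The serialization bug above comes down to a general Java behavior: identity-based collections compare boxed integers by reference, and only a small range of values (by default -128 to 127) is guaranteed to come from the shared `Integer` cache. The sketch below reproduces the effect with the JDK's `IdentityHashMap` standing in for CoreNLP's `IdentityHashSet`; the class and method names are hypothetical, and the result for large values assumes default JVM settings.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Set;

final class IdentityBoxingDemo {
    /** True if an identity-based set finds a separately boxed copy of the value. */
    static boolean foundByIdentity(int value) {
        Set<Integer> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        seen.add(Integer.valueOf(value));
        // Lookup succeeds only if both boxes are the very same object.
        return seen.contains(Integer.valueOf(value));
    }

    /** True if an equals-based set finds a separately boxed copy of the value. */
    static boolean foundByEquals(int value) {
        Set<Integer> seen = new HashSet<>();
        seen.add(Integer.valueOf(value));
        return seen.contains(Integer.valueOf(value));
    }

    public static void main(String[] args) {
        // Small values share one cached Integer object, so identity lookup works.
        System.out.println("identity, 100:  " + foundByIdentity(100));
        // Larger values are boxed into distinct objects under default settings.
        System.out.println("identity, 1000: " + foundByIdentity(1000));
        // Value equality is unaffected by which object holds the number.
        System.out.println("equals, 1000:   " + foundByEquals(1000));
    }
}
```

This is why word indices in short sentences can appear to work with an identity set while longer sentences fail: the lookup depends on object identity, not on the numeric value.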
Security fixes
- update xom to 1.3.9, which should avoid unwanted, potentially vulnerable transitive dependencies c8772b7
- remove bz2 zip & unzip, which used a shell command and therefore could be hijacked https://nvd.nist.gov/vuln/detail/CVE-2023-39020
English dependency converter fixes
- addressing issue #1363
- fix `(QP up to ...)` 8c46648 9a86ece
- fix `up to 1700 kilograms` if misparsed in a predictable manner 6e14527
- better `LST` coverage 5745de5
- vmod/acl when the parser misinterprets `NP` vs `NML` ad4556d
- treat lists of `NML` as repeated modifiers of a noun, instead of a list, as that is the likely meaning of `NML`. example: `a 72-game, three-month season` from PTB 61ef545 5e748dc
Server features
- Scenegraph endpoint 8b40947 #1346
- remove one json library to reduce number of json libraries we depend on 357b1bb
Small changes
- allow `fourty` as a number in SUTime 7fbb7b8
- capture `forty (40) days` as a duration in SUTime b3c47a0
- feature to print out the feature index of an NER model as a text file f636673
- clarify the INTJ rule for the ChineseHeadFinder 56cd6bb
- consider `{` `}` as punctuation when scoring English constituency treebanks a606afa
- fix error in test case, from @tanloong #1373 #1372
- dead code cleanup 86b6a03