github explosion/spacy-models zh_core_web_lg-2.3.1

Downloads

Details: https://spacy.io/models/zh#zh_core_web_lg

File checksum: 9406282662f27083a5b09a864c23ff4436d2f5602f2a3c963f601ef2d999fd9f
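
A downloaded archive can be checked against the checksum above. The following is a minimal sketch, assuming the 64-character hex value is a SHA-256 digest of the release archive (the release notes do not name the algorithm). The file name in the usage note is hypothetical.

```python
import hashlib

# Checksum copied from the release notes; assumed here to be SHA-256.
EXPECTED = "9406282662f27083a5b09a864c23ff4436d2f5602f2a3c963f601ef2d999fd9f"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED):
    """True if the file's digest equals the published checksum."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected("zh_core_web_lg-2.3.1.tar.gz")` would return True only if the local file (name assumed) hashes to the published value.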

Chinese multi-task CNN trained on OntoNotes. Assigns word vectors, POS tags, dependency parse and named entities. Word vectors trained using FastText CBOW on Wikipedia and OSCAR (Common Crawl).

Feature Description
Name zh_core_web_lg
Version 2.3.1
spaCy >=2.3.0,<2.4.0
Model size 575 MB
Pipeline  tagger, parser, ner
Vectors 500000 keys, 500000 unique vectors (300 dimensions)
Sources OntoNotes 5, OSCAR (Common Crawl), Wikipedia (20200301)
License MIT
Author Explosion
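
The spaCy row above is a version range: the model targets spaCy 2.3.x only. As a rough sketch of what that constraint means, the helper below checks a plain three-part version string against the range. The function names are illustrative, and it handles only simple numeric versions such as "2.3.5"; a real project would more likely rely on pip or a dedicated version-parsing library.

```python
def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(spacy_version, lower="2.3.0", upper="2.4.0"):
    """True if lower <= spacy_version < upper (the range from the table)."""
    return parse_version(lower) <= parse_version(spacy_version) < parse_version(upper)


if __name__ == "__main__":
    for candidate in ("2.2.4", "2.3.0", "2.3.5", "2.4.0", "3.0.0"):
        print(candidate, is_compatible(candidate))
```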

Label Scheme

Component Labels
tagger  AD, AS, BA, CC, CD, CS, DEC, DEG, DER, DEV, DT, ETC, FW, IJ, INF, JJ, LB, LC, M, MSP, NN, NR, NT, OD, ON, P, PN, PU, SB, SP, URL, VA, VC, VE, VV, X, _SP
parser  ROOT, acl, advcl:loc, advmod, advmod:dvp, advmod:loc, advmod:rcomp, amod, amod:ordmod, appos, aux:asp, aux:ba, aux:modal, aux:prtmod, auxpass, case, cc, ccomp, compound:nn, compound:vc, conj, cop, dep, det, discourse, dobj, etc, mark, mark:clf, name, neg, nmod, nmod:assmod, nmod:poss, nmod:prep, nmod:range, nmod:tmod, nmod:topic, nsubj, nsubj:xsubj, nsubjpass, nummod, parataxis:prnmod, punct, xcomp
ner  CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART

Accuracy

Type Score
LAS  64.99
UAS  69.77
TOKEN_ACC  94.58
TAGS_ACC  90.55
ENTS_F  69.33
ENTS_P  70.81
ENTS_R  67.91
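
The three entity scores are related: assuming ENTS_F is the usual F1 score, that is, the harmonic mean of precision (ENTS_P) and recall (ENTS_R), it can be recomputed from the other two rows. A small sketch:

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the ENTS_P and ENTS_R rows above.
ents_f = f_score(70.81, 67.91)
print(round(ents_f, 2))  # 69.33, matching the ENTS_F row
```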

Installation

pip install "spacy>=2.3.0,<2.4.0"
python -m spacy download zh_core_web_lg
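
Once the model is installed, it can be loaded by name. The sketch below is illustrative rather than official usage: the helper names and the sample sentence are made up for this example, and it assumes a spaCy 2.3.x environment in which the model and any dependencies it needs are installed. It prints the fine-grained tag, dependency label and head for each token, then the named entities, using labels from the scheme listed above.

```python
def token_rows(doc):
    """Return (text, fine-grained tag, dependency label, head text) per token."""
    return [(tok.text, tok.tag_, tok.dep_, tok.head.text) for tok in doc]


def entity_rows(doc):
    """Return (text, entity label) for each named entity in the document."""
    return [(ent.text, ent.label_) for ent in doc.ents]


def main():
    # Imported here so the helpers above can be used without spaCy installed.
    import spacy

    nlp = spacy.load("zh_core_web_lg")
    doc = nlp("苹果公司正考虑在北京开设新的办公室。")  # sample sentence (assumed)

    for row in token_rows(doc):
        print(row)
    for row in entity_rows(doc):
        print(row)

    # Each token carries a word vector; the table above lists 300 dimensions.
    print(doc[0].vector.shape)
```

Calling `main()` runs the pipeline. The actual tags and entities depend on the model's predictions, so no expected output is shown.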
