github explosion/spacy-models ja_core_news_lg-2.3.0

Downloads

Details: https://spacy.io/models/ja#ja_core_news_lg

File checksum: 607aaef70e011b35b15470ac3fc8a9a40dc76ffe1fce03a8f4f2fbc12b30d8d2
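
As a minimal sketch of how the checksum above could be used: the value is 64 hexadecimal characters, which is consistent with SHA-256, but the release notes do not name the algorithm, so that is an assumption here. The helper names and the archive filename in the usage note are likewise hypothetical.

```python
import hashlib

# Checksum copied from the release notes; assumed (not confirmed) to be SHA-256.
EXPECTED_SHA256 = "607aaef70e011b35b15470ac3fc8a9a40dc76ffe1fce03a8f4f2fbc12b30d8d2"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_checksum("ja_core_news_lg-2.3.0.tar.gz")` would return True only if a locally downloaded archive with that (assumed) filename hashes to the published value.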

Japanese multi-task CNN trained on UD_Japanese-GSD v2.6-NE. Assigns word2vec token vectors, POS tags, dependency parses and named entities.

Feature     Description
Name        ja_core_news_lg
Version     2.3.0
spaCy       >=2.3.0,<2.4.0
Model size  526 MB
Pipeline    parser, ner
Vectors     480443 keys, 480443 unique vectors (300 dimensions)
Sources     UD_Japanese-GSD v2.6 (Omura, Mai; Miyao, Yusuke; Kanayama, Hiroshi; Matsuda, Hiroshi; Wakasa, Aya; Yamashita, Kayo; Asahara, Masayuki; Tanaka, Takaaki; Murawaki, Yugo; Matsumoto, Yuji; Mori, Shinsuke; Uematsu, Sumire; McDonald, Ryan; Nivre, Joakim; Zeman, Daniel)
            UD_Japanese-GSD v2.6-NE (Megagon Labs Tokyo)
            chiVe: Japanese Word Embedding with Sudachi & NWJC (chive-1.1-mc90-500k) (Works Applications)
            SudachiPy (Works Applications)
            SudachiDict (Works Applications)
License     CC BY-SA 4.0
Author      Explosion and Megagon Labs Tokyo
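
The spaCy requirement listed above (>=2.3.0,<2.4.0) can be illustrated with a small check. This is a simplified sketch using only the standard library: the function name is hypothetical, and it compares numeric release components only, so it does not implement full PEP 440 semantics (pre-release and dev suffixes are ignored).

```python
import re

# Bounds taken from the "spaCy" entry: >=2.3.0,<2.4.0
LOWER_BOUND = (2, 3, 0)
UPPER_BOUND = (2, 4, 0)


def is_compatible(version):
    """Return True if a version string such as '2.3.5' lies in [2.3.0, 2.4.0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0 (e.g. '2.3' becomes (2, 3, 0)).
    parts = tuple(int(group) if group is not None else 0 for group in match.groups())
    return LOWER_BOUND <= parts < UPPER_BOUND


print(is_compatible("2.3.5"))  # True
print(is_compatible("2.4.0"))  # False
```

In practice, pip enforces this constraint when the model package is installed; the sketch only makes the range explicit.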

Label Scheme

Component  Labels
parser     ROOT, acl, advcl, advmod, amod, aux, case, cc, ccomp, compound, cop, csubj, dep, det, dislocated, fixed, mark, nmod, nsubj, nummod, obj, obl, punct
ner        CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, MOVEMENT, NORP, ORDINAL, ORG, PERCENT, PERSON, PET_NAME, PHONE, PRODUCT, QUANTITY, TIME, TITLE_AFFIX, WORK_OF_ART

Accuracy

Type       Score
LAS        87.55
UAS        88.94
TOKEN_ACC  97.67
ENTS_F     70.48
ENTS_P     71.79
ENTS_R     69.22
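
The three entity scores are related: assuming ENTS_F is the usual F1 score, it is the harmonic mean of precision (ENTS_P) and recall (ENTS_R). The short sketch below, with a hypothetical helper name, reproduces the reported value from the other two.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


ents_p, ents_r = 71.79, 69.22  # ENTS_P and ENTS_R from the accuracy figures
print(round(f_score(ents_p, ents_r), 2))  # 70.48, matching ENTS_F
```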

Installation

pip install spacy
python -m spacy download ja_core_news_lg
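
Once the commands above have run, the model could be loaded and applied roughly as follows. This is a sketch rather than official usage: the `load_model` helper and the sample sentence are illustrative choices, and the helper returns None instead of raising if spaCy or the model is not installed.

```python
def load_model(name="ja_core_news_lg"):
    """Return the loaded pipeline, or None if spaCy or the model is unavailable."""
    try:
        import spacy  # imported lazily so the sketch degrades gracefully

        return spacy.load(name)
    except (ImportError, OSError):
        return None


nlp = load_model()
if nlp is None:
    print("Model not available; run the installation commands first.")
else:
    doc = nlp("東京は日本の首都です。")  # arbitrary sample sentence
    # Per-token POS tag, dependency label, and syntactic head
    for token in doc:
        print(token.text, token.pos_, token.dep_, token.head.text)
    # Named entities with their labels
    for ent in doc.ents:
        print(ent.text, ent.label_)
```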
