github explosion/spacy-models ja_core_news_sm-2.3.0

Details: https://spacy.io/models/ja#ja_core_news_sm

File checksum: 1a375e7339deb3eb4afa28321f545f1f933ce88082aca8294193f1787f1aa7ab
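
The checksum can be used to verify a downloaded model archive before installing it. The following is a minimal sketch, assuming the value above is a SHA-256 hex digest (its 64 hexadecimal characters suggest so, but the release page does not name the algorithm). The helper names `sha256_of` and `matches_checksum` are hypothetical and not part of spaCy.

```python
import hashlib

# Checksum as published on this release page (assumed to be SHA-256).
EXPECTED = "1a375e7339deb3eb4afa28321f545f1f933ce88082aca8294193f1787f1aa7ab"


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected=EXPECTED):
    """Compare the file's digest with the expected one, ignoring letter case."""
    return sha256_of(path) == expected.lower()
```

Calling `matches_checksum` with the path of the downloaded archive returns `True` only if the file's digest equals the published value.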

Japanese multi-task CNN trained on UD_Japanese-GSD v2.6-NE. Assigns context-specific token vectors, POS tags, dependency parses and named entities.

| Feature | Description |
| --- | --- |
| Name | ja_core_news_sm |
| Version | 2.3.0 |
| spaCy | >=2.3.0,<2.4.0 |
| Model size | 7 MB |
| Pipeline | parser, ner |
| Vectors | 0 keys, 0 unique vectors (0 dimensions) |
| Sources | UD_Japanese-GSD v2.6 (Omura, Mai; Miyao, Yusuke; Kanayama, Hiroshi; Matsuda, Hiroshi; Wakasa, Aya; Yamashita, Kayo; Asahara, Masayuki; Tanaka, Takaaki; Murawaki, Yugo; Matsumoto, Yuji; Mori, Shinsuke; Uematsu, Sumire; McDonald, Ryan; Nivre, Joakim; Zeman, Daniel) |
| | UD_Japanese-GSD v2.6-NE (Megagon Labs Tokyo) |
| | SudachiPy (Works Applications) |
| | SudachiDict (Works Applications) |
| License | CC BY-SA 4.0 |
| Author | Explosion and Megagon Labs Tokyo |
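
The spaCy compatibility range listed above can be checked in code before loading the model. This is a rough sketch using only the standard library. The function names `parse_version` and `is_compatible` are hypothetical, and the parser assumes plain numeric versions such as `2.3.1`; pre-release or development version strings are not handled.

```python
def parse_version(version):
    """Turn '2.3.1' into (2, 3, 1), padding short versions like '2.3' with zeros."""
    parts = tuple(int(part) for part in version.split("."))
    return parts + (0,) * (3 - len(parts))


def is_compatible(spacy_version, lower="2.3.0", upper="2.4.0"):
    """Return True if lower <= spacy_version < upper, mirroring '>=2.3.0,<2.4.0'."""
    return parse_version(lower) <= parse_version(spacy_version) < parse_version(upper)
```

With spaCy installed, `is_compatible(spacy.__version__)` indicates whether the installed version falls inside the range this model declares.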

Label Scheme

| Component | Labels |
| --- | --- |
| parser | ROOT, acl, advcl, advmod, amod, aux, case, cc, ccomp, compound, cop, csubj, dep, det, dislocated, fixed, mark, nmod, nsubj, nummod, obj, obl, punct |
| ner | CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, MOVEMENT, NORP, ORDINAL, ORG, PERCENT, PERSON, PET_NAME, PHONE, PRODUCT, QUANTITY, TIME, TITLE_AFFIX, WORK_OF_ART |

Accuracy

| Type | Score |
| --- | --- |
| LAS | 86.87 |
| UAS | 88.68 |
| TOKEN_ACC | 97.67 |
| ENTS_F | 59.93 |
| ENTS_P | 64.88 |
| ENTS_R | 55.68 |
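
The three entity scores are related: assuming ENTS_F is the usual F1 score, that is, the harmonic mean of precision (ENTS_P) and recall (ENTS_R), it can be recomputed from the other two values. The sketch below illustrates that relationship; the function name `f_score` is hypothetical.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# ENTS_P and ENTS_R as reported in the accuracy table.
ents_f = f_score(64.88, 55.68)
print(round(ents_f, 2))  # 59.93, matching the reported ENTS_F
```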

Installation

pip install "spacy>=2.3.0,<2.4.0"
python -m spacy download ja_core_news_sm
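
Once installed, the model can be loaded by name and applied to Japanese text. The following is a minimal sketch, assuming a compatible spaCy version and the model are installed. The helper names `summarize` and `run_demo` and the sample sentence are illustrative only; `spacy.load`, `token.dep_`, `token.head`, `doc.ents`, and `ent.label_` are standard spaCy attributes.

```python
def summarize(doc):
    """Collect (text, dependency label, head text) triples and (text, label) entity pairs."""
    deps = [(token.text, token.dep_, token.head.text) for token in doc]
    ents = [(ent.text, ent.label_) for ent in doc.ents]
    return deps, ents


def run_demo(text="銀座でランチをご一緒しましょう。"):
    """Load the model, parse one sentence, and print the parser and NER output."""
    import spacy  # imported here so summarize() can be used without spaCy installed

    nlp = spacy.load("ja_core_news_sm")
    deps, ents = summarize(nlp(text))
    for token_text, dep, head_text in deps:
        print(token_text, dep, head_text)
    for ent_text, label in ents:
        print(ent_text, label)
```

Calling `run_demo()` prints one line per token with its dependency label and head, followed by any named entities the `ner` component finds. The labels come from the label scheme listed above.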
