Checksum .tar.gz: 58f60ecb0c30cfcfbd0f7080550dc7ce86c2c4bcdad4c4086983e956c24b4e16
Checksum .whl: 127f9e8018673fd5a599f22e2e0fa019814c75ec43a6cf1274312a1314433c6d
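The checksums above can be compared against a downloaded archive before installing it. The sketch below assumes they are SHA-256 digests (they are 64 hex characters long, but the card does not name the algorithm); the helper names `sha256_of` and `verify` are hypothetical and not part of spaCy.

```python
import hashlib
from pathlib import Path

# Digests copied from this card; assumed (not stated) to be SHA-256.
EXPECTED_SHA256 = {
    ".tar.gz": "58f60ecb0c30cfcfbd0f7080550dc7ce86c2c4bcdad4c4086983e956c24b4e16",
    ".whl": "127f9e8018673fd5a599f22e2e0fa019814c75ec43a6cf1274312a1314433c6d",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Return True if the file's digest matches the one listed for its suffix."""
    name = Path(path).name
    for suffix, expected in EXPECTED_SHA256.items():
        if name.endswith(suffix):
            return sha256_of(path) == expected
    raise ValueError(f"no checksum listed for {name!r}")
```

Calling `verify()` with the local path of the downloaded `.whl` or `.tar.gz` returns `False` when the file is corrupted or is not the 3.7.1 release.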
Details: https://spacy.io/models/ja#ja_core_news_trf
Japanese transformer pipeline (Transformer(name='cl-tohoku/bert-base-japanese-char-v2', piece_encoder='char', stride=160, type='bert', width=768, window=216, vocab_size=6144)). Components: transformer, morphologizer, parser, ner.
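The `window=216` and `stride=160` settings describe overlapping spans fed to the transformer: each span covers up to 216 positions and the next one starts 160 positions later, so neighbouring spans share 56 positions. A minimal sketch of that arithmetic follows; `strided_windows` is a hypothetical helper written for illustration, not the library's own span getter, and its handling of the final span is an assumption.

```python
def strided_windows(n_positions, window=216, stride=160):
    """Return (start, end) bounds of overlapping spans over a sequence.

    Illustrative only: the last span is simply truncated at the end of
    the sequence, which may differ from what the pipeline actually does.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    bounds = []
    start = 0
    while start < n_positions:
        end = min(start + window, n_positions)
        bounds.append((start, end))
        if end == n_positions:
            break  # the sequence is fully covered
        start += stride
    return bounds


if __name__ == "__main__":
    for start, end in strided_windows(500):
        print(start, end)
```

Because the stride is smaller than the window, every position past the first 160 can appear in two spans, which gives the model context on both sides of a span boundary.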
Feature | Description |
---|---|
Name | ja_core_news_trf |
Version | 3.7.1 |
spaCy | >=3.7.0.dev0,<3.8.0 |
Default Pipeline | transformer, morphologizer, parser, attribute_ruler, ner |
Components | transformer, morphologizer, parser, attribute_ruler, ner |
Vectors | 0 keys, 0 unique vectors (0 dimensions) |
Sources | UD Japanese GSD v2.8 (Omura, Mai; Miyao, Yusuke; Kanayama, Hiroshi; Matsuda, Hiroshi; Wakasa, Aya; Yamashita, Kayo; Asahara, Masayuki; Tanaka, Takaaki; Murawaki, Yugo; Matsumoto, Yuji; Mori, Shinsuke; Uematsu, Sumire; McDonald, Ryan; Nivre, Joakim; Zeman, Daniel)<br>UD Japanese GSD v2.8 NER (Megagon Labs Tokyo)<br>cl-tohoku/bert-base-japanese-char-v2 (Inui Laboratory, Tohoku University) |
License | CC BY-SA 3.0 |
Author | Explosion |
Model size | 320 MB |
Label Scheme
64 labels for 3 components
Component | Labels |
---|---|
morphologizer | POS=NOUN, POS=ADP, POS=VERB, POS=SCONJ, POS=AUX, POS=PUNCT, POS=PART, POS=DET, POS=NUM, POS=ADV, POS=PRON, POS=ADJ, POS=PROPN, POS=CCONJ, POS=SYM, POS=NOUN\|Polarity=Neg, POS=AUX\|Polarity=Neg, POS=INTJ, POS=SCONJ\|Polarity=Neg |
parser | ROOT, acl, advcl, advmod, amod, aux, case, cc, ccomp, compound, cop, csubj, dep, det, dislocated, fixed, mark, nmod, nsubj, nummod, obj, obl, punct |
ner | CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, MOVEMENT, NORP, ORDINAL, ORG, PERCENT, PERSON, PET_NAME, PHONE, PRODUCT, QUANTITY, TIME, TITLE_AFFIX, WORK_OF_ART |
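Each morphologizer label packs a part-of-speech tag and optional morphological features into one string, with `|` separating `Key=Value` pairs. As a small illustration, a label can be split into a dictionary like this; `parse_morph_label` is a hypothetical helper, not a spaCy function.

```python
def parse_morph_label(label):
    """Split a label such as 'POS=NOUN|Polarity=Neg' into a dict."""
    features = {}
    for pair in label.split("|"):
        key, _, value = pair.partition("=")
        features[key] = value
    return features


if __name__ == "__main__":
    for label in ("POS=NOUN", "POS=AUX|Polarity=Neg"):
        print(label, parse_morph_label(label))
```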
Accuracy
Type | Score |
---|---|
TOKEN_ACC | 99.37 |
TOKEN_P | 97.65 |
TOKEN_R | 97.90 |
TOKEN_F | 97.77 |
POS_ACC | 97.96 |
MORPH_ACC | 0.00 |
MORPH_MICRO_P | 34.01 |
MORPH_MICRO_R | 98.04 |
MORPH_MICRO_F | 50.51 |
SENTS_P | 95.57 |
SENTS_R | 97.83 |
SENTS_F | 96.69 |
DEP_UAS | 93.32 |
DEP_LAS | 92.26 |
TAG_ACC | 97.13 |
LEMMA_ACC | 96.71 |
ENTS_P | 84.45 |
ENTS_R | 82.64 |
ENTS_F | 83.53 |
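The `_F` rows are consistent with the usual F1 definition, the harmonic mean of precision and recall: F = 2PR / (P + R). The sketch below recomputes it from the rounded `_P` and `_R` values in the table, so the results may differ from the reported figures in the last decimal place.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) copied from the accuracy table.
REPORTED = {
    "TOKEN": (97.65, 97.90, 97.77),
    "MORPH_MICRO": (34.01, 98.04, 50.51),
    "SENTS": (95.57, 97.83, 96.69),
    "ENTS": (84.45, 82.64, 83.53),
}

if __name__ == "__main__":
    for name, (p, r, reported) in REPORTED.items():
        print(f"{name}_F recomputed={f_score(p, r):.2f} reported={reported:.2f}")
```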
Installation
pip install spacy
python -m spacy download ja_core_news_trf
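Once installed, the pipeline can be loaded by name with `spacy.load`. The following is a minimal sketch of typical usage; the `analyze` wrapper and the sample sentence are illustrative assumptions, and the function returns `None` instead of raising when spaCy or the model is not available.

```python
def analyze(text, model_name="ja_core_news_trf"):
    """Run the pipeline on text and collect tokens and entities.

    Returns None if spaCy or the model package cannot be loaded.
    """
    try:
        import spacy

        nlp = spacy.load(model_name)
    except (ImportError, OSError):
        # spacy.load raises OSError when the model package is not installed.
        return None
    doc = nlp(text)
    return {
        "tokens": [(token.text, token.pos_, token.dep_) for token in doc],
        "entities": [(ent.text, ent.label_) for ent in doc.ents],
    }


if __name__ == "__main__":
    result = analyze("東北大学は仙台市にあります。")
    if result is None:
        print("ja_core_news_trf is not installed")
    else:
        for text, pos, dep in result["tokens"]:
            print(text, pos, dep)
        print(result["entities"])
```

The `pos_` and `dep_` values come from the morphologizer and parser labels listed in the label scheme, and `ent.label_` from the ner labels.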