Thai-NNER (Thai Nested Named Entity Recognition Corpus)
This work presents the first Thai Nested Named Entity Recognition (N-NER) dataset. Thai N-NER comprises 264,798 mentions across 104 classes, with a maximum nesting depth of 8 layers, drawn from 4,894 documents (news articles and restaurant reviews). To the best of our knowledge, it is the largest non-English N-NER dataset and the first non-English one with fine-grained classes.
|
CC BY-SA 3.0
IST, VISTEC |
GitHub |
นัชชา ถิระสาโรช |
corpora by Wirote Aroonmanakun's students |
|
? |
นัชชา ถิระสาโรช |
นัชชา ถิระสาโรช Data |
ศศิวิมล กาลันสีมา |
corpora by Wirote Aroonmanakun's students |
|
? |
ศศิวิมล กาลันสีมา |
ศศิวิมล กาลันสีมา Data |
ณัฐดาพร เลิศชีวะ |
corpora by Wirote Aroonmanakun's students |
|
? |
ณัฐดาพร เลิศชีวะ |
ณัฐดาพร เลิศชีวะ Data |
Thai NER |
The Thai NER project is part of PyThaiNLP.
|
CC BY 3.0 |
Wannaphong Phatthiyaphaibun |
GitHub |
THAI-NEST |
Thai Named Entity tagging Corpus from NECTEC & Thammasat University |
|
CC BY-NC-SA 3.0
NECTEC |
aiforthai (registration required) |
WikiANN |
WikiANN (sometimes called PAN-X) is a multilingual named entity recognition dataset consisting of Wikipedia articles annotated with LOC (location), PER (person), and ORG (organisation) tags in the IOB2 format. |
|
|
Rahimi, Afshin and Li, Yuan and Cohn, Trevor |
GitHub |
Crime Named Entity Recognition |
An NER project built on a Thai crime news dataset
|
|
|
GitHub |
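
The IOB2 format mentioned in the WikiANN entry marks the first token of each entity with `B-<label>`, continuation tokens with `I-<label>`, and everything else with `O`. The sketch below shows one way to turn such a tag sequence into entity spans. The function name `iob2_to_spans` and the sample tokens are hypothetical and chosen only for illustration; they do not come from any of the listed datasets or their loaders.

```python
from typing import List, Tuple


def iob2_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Convert IOB2 tags to (label, start, end) spans, with end exclusive.

    Assumption: an "I-" tag that does not continue an open entity of the
    same label is treated as invalid and ignored.
    """
    spans: List[Tuple[str, int, int]] = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # Close the open entity on "O", on a new "B-", or on a mismatched "I-".
        closes = (
            tag == "O"
            or tag.startswith("B-")
            or (tag.startswith("I-") and tag[2:] != label)
        )
        if closes:
            if label is not None:
                spans.append((label, start, i))
            start, label = None, None
        # Open a new entity on "B-".
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    # Flush an entity that runs to the end of the sequence.
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans


# Illustrative (made-up) example using the three WikiANN tag types.
tokens = ["Somchai", "Jaidee", "visited", "Bangkok", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
for label, start, end in iob2_to_spans(tags):
    print(label, tokens[start:end])
```

Flat IOB2 tagging of this kind assigns one label per token, so it cannot represent the overlapping, multi-layer mentions that a nested corpus such as Thai-NNER annotates.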