UD Thai PUD |
This treebank is part of the Parallel Universal Dependencies (PUD) treebanks created for the CoNLL 2017 shared task on Multilingual Parsing from Raw Text to Universal Dependencies. |
1,000 sentences |
CC BY-SA 3.0 |
Universal Dependencies |
GitHub |
Thai Treebanks Dataset (thtb) |
To enable research opportunities despite the scarcity of Thai computational linguistics resources, we introduce Thai Treebanks, a fundamental high-level language resource built from scratch, with passion, for researchers and enthusiasts. |
5,200 sentences |
CC BY 4.0 |
Pechlada Seenual, Thodsaporn Chay-intr and Thanaruk Theeramunkong |
GitHub |
Blackboard Treebank |
Blackboard Treebank is a Thai dependency corpus based on the LST20 Annotation Guideline. It features dependency structures, constituency structures, word boundaries, named entities, clause boundaries, and sentence boundaries. |
122,851 clauses (38,558 sentences) |
CC BY 3.0 |
Prachya Boonkwan, NECTEC |
Bitbucket |
Thai Universal Dependency Treebank (TUD) |
The Thai Universal Dependency Treebank consists of 3,627 trees annotated in accordance with the Universal Dependencies (UD) framework. |
3,627 trees |
|
Chulalongkorn University |
GitHub |
Thai Discourse Treebank |
Thai Discourse Treebank is the first and largest Thai corpus annotated with explicit discourse relations in the style of the English Penn Discourse Treebank 3 scheme. The final corpus consists of 10,602 sentences from 384 documents, 180 of which have complete annotation of discourse connectives and their two argument spans. |
|
|
Ponrawee Prasertsom, Apiwat Jaroonpol, Attapol T. Rutherford |
GitHub |