| Orchid Corpus | Thai part-of-speech (POS) tagged corpus | 5,200 sentences | CC BY-NC-SA 4.0 | NECTEC | Mirror from @wannaphong | 
| Blackboard Treebank | Blackboard Treebank is a Thai dependency corpus based on the LST20 Annotation Guideline. It features dependency structures, constituency structures, word boundaries, named entities, clause boundaries, and sentence boundaries. | 122,851 clauses (38,558 sentences) | CC BY 3.0 | Prachya Boonkwan, NECTEC | Bitbucket | 
| UD Thai PUD | Part of the Parallel Universal Dependencies (PUD) treebanks created for the CoNLL 2017 shared task on Multilingual Parsing from Raw Text to Universal Dependencies. | 1,000 sentences | CC BY-SA 3.0 | Universal Dependencies | GitHub | 
| thai-political-tweets | A small dataset of Thai political tweets annotated with UD POS tags | 41 tweets, 965 words | Unlicense | Can Udomcharoenchaikit | GitHub | 
| Thai Universal Dependency Treebank (TUD) | A Thai treebank annotated in accordance with the Universal Dependencies (UD) framework. | 3,627 trees |  | Chulalongkorn University | GitHub | 
| Thai Discourse Treebank | Thai Discourse Treebank is the first and largest Thai corpus annotated with explicit discourse relations in the style of the English Penn Discourse Treebank 3 scheme. The final corpus consists of 10,602 sentences from 384 documents, 180 of which have complete annotation of discourse connectives and their two argument spans. |  |  | Ponrawee Prasertsom, Apiwat Jaroonpol, Attapol T. Rutherford | GitHub |