prachathai-67k |
News Article Corpus from Prachathai.com |
67,889 articles with 51,797 tags |
12 |
CC BY 4.0 |
@lukkiddd and @cstorm125 |
GitHub |
wisesight sentiment |
Thai-language social media messages with sentiment labels (positive, neutral, negative, question). |
26,737 messages |
4 |
CC0-1.0 License |
Arthit Suriyawongkul, Ekapol Chuangsuwanich |
GitHub |
wongnai corpus |
A collection of Wongnai's datasets, most of which are in Thai. |
500K words labeled |
5 |
LGPL-3.0 License |
wongnai |
GitHub |
Toxicity in Thai Tweet Corpus |
Toxicity in Thai Tweet Corpus |
3,300 messages |
2 |
CC BY-NC 4.0 |
Tokyo Metropolitan University Natural Language Processing Group |
GitHub |
Thai-Clickbait |
The dataset for Thai Clickbait classification |
train: 37,376 messages, test: 9,344 messages |
1 |
MIT License |
@9meo at GitHub |
GitHub |
sentiment_analysis_thai |
Thai sentiment analysis from @JagerV3 |
|
2 |
? |
@JagerV3 at GitHub |
GitHub |
thai-emojification |
Emojification of Thai Text, Using Deep Learning (LSTM). |
train: 128 messages, test: 55 messages |
5 (❤️😄😞🍴⚾) |
GPL-3.0 License |
iApp Technology Co., Ltd. |
GitHub |
The 40 Thai Children Stories |
The dataset was collected from 40 Thai children's stories. We manually split the text into sentences, which yielded 1,964 sentences. |
1,964 sentences |
3 |
? |
Kitsuchart Pasupa, Thititorn Seneewong Na Ayutthaya |
GitHub |
Thai sentiment analysis dataset |
Thai sentiment analysis dataset from PyThaiNLP |
|
2 |
CC BY 3.0 |
PyThaiNLP |
GitHub |
LimeSoda: Dataset for Fake News Detection in Healthcare Domain |
Thai fake news dataset in the healthcare domain, consisting of 7,191 curated and manually annotated documents |
7,191 annotated documents |
3 (fact, fake, or undefined) |
CC-BY-4.0 License |
Patomporn Payoungkhamdee, Peerachet Porkaew, Atthasith Sinthunyathum, Phattharaphon Songphum, Witsarut Kawidam, Wichayut Loha-Udom, Prachya Boonkwan, Vipas Sutantayawalee |
GitHub |
krathu-500 |
A dataset of posts and comments from Pantip, a popular Thai web board. |
|
3 (Positive, Negative, and Neutral) |
|
|
GitHub |
thai_cyberbullying_lgbt |
LGBT Cyberbullying Detection in Thai Language Utilizing Transformers-Based Algorithms |
|
|
|
|
GitHub |