Optical Character Recognition

Corpus

| Name | Description | Size | License | Creator | Download |
| --- | --- | --- | --- | --- | --- |
| KVIS Thai OCR Dataset | Offline Thai handwritten character dataset | | CC BY 4.0 | John Joseph, Ferdin Joe | Website |
| Thai OCR | Thai OCR dataset from NECTEC | Training set: 81,100 images | CC BY-NC-SA 3.0 | NECTEC | aiforthai (registration required) |
| Thai handwriting number dataset | Thai handwritten number dataset | | MIT | @kittinan | GitHub |

Software

| Name | Description | Status | Language | License |
| --- | --- | --- | --- | --- |
| Tesseract OCR | Tesseract Open Source OCR Engine | active | C/C++ | Apache License 2.0 |
| Easy OCR | Ready-to-use OCR supporting 40+ languages, including Chinese, Japanese, Korean and Thai | active | Python 3.X | Apache License 2.0 |
| Thai National Document Optical Character Recognition (THND OCR) | Tesseract OCR tools for reading Thai national documents, trained and fine-tuned on the TH Sarabun national font. See the project's README.md for details of the author's process. | active | Python 3.X | |
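
As a rough illustration of how the Tesseract entry above might be used on Thai text, here is a minimal sketch that calls the `tesseract` command-line tool from Python. It assumes Tesseract 4 or later is installed and on `PATH` with the Thai language data (`tha`) available. The helper names `build_tesseract_command` and `ocr_image` and the file name `scan.png` are hypothetical and not part of any listed project.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="tha", psm=None):
    """Build the argument list for a Tesseract CLI call that prints text to stdout.

    Hypothetical helper for illustration. "tha" is Tesseract's language code
    for Thai; several languages can be combined with "+", e.g. "tha+eng".
    """
    # Passing "stdout" as the output base makes Tesseract print the text
    # instead of writing it to a .txt file.
    cmd = ["tesseract", image_path, "stdout", "-l", lang]
    if psm is not None:
        # Page segmentation mode; the "--psm" spelling assumes Tesseract 4+.
        cmd += ["--psm", str(psm)]
    return cmd


def ocr_image(image_path, lang="tha"):
    """Run Tesseract on one image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        encoding="utf-8",  # Thai output should be decoded as UTF-8
        check=True,        # raise CalledProcessError if Tesseract fails
    )
    return result.stdout
```

Calling `ocr_image("scan.png", lang="tha+eng")` would then return the recognized text for a mixed Thai and English page, provided both language data files are installed.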