Optical Character Recognition

Corpus

| Name | Description | Size | License | Creator | Download |
| --- | --- | --- | --- | --- | --- |
| KVIS Thai OCR Dataset | Offline Thai handwritten character dataset | | CC BY 4.0 | John Joseph, Ferdin Joe | Website |
| Thai OCR | Thai OCR dataset from NECTEC | Training set: 81,100 images | CC BY-NC-SA 3.0 | NECTEC | aiforthai (registration required) |

Software

| Name | Description | Status | Language | License |
| --- | --- | --- | --- | --- |
| Tesseract OCR | Tesseract Open Source OCR Engine | active | C/C++ | Apache License 2.0 |
| EasyOCR | Ready-to-use OCR with support for 40+ languages, including Chinese, Japanese, Korean, and Thai | active | Python 3.X | Apache License 2.0 |
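
As a rough illustration of how the EasyOCR entry above might be used for Thai text, here is a minimal sketch. It assumes the `easyocr` package is installed and that `th` and `en` are the language codes wanted. The helper names `join_confident_text` and `read_thai_text`, and the 0.3 confidence cutoff, are hypothetical choices made for this example and are not part of the library.

```python
from typing import List, Sequence, Tuple

# One detection in the shape EasyOCR's readtext() returns by default:
# (bounding box, recognized text, confidence score)
Detection = Tuple[Sequence, str, float]


def join_confident_text(detections: List[Detection], min_conf: float = 0.3) -> str:
    """Hypothetical helper: keep detections at or above min_conf and join their text.

    The 0.3 default is an arbitrary assumption, not a recommended value.
    """
    return " ".join(text for _box, text, conf in detections if conf >= min_conf)


def read_thai_text(image_path: str, min_conf: float = 0.3) -> str:
    """Hypothetical wrapper: run EasyOCR on one image with Thai and English models."""
    # Third-party dependency, imported lazily so the helper above works without it.
    import easyocr

    # Model files are downloaded on first use; gpu=False keeps the sketch CPU-only.
    reader = easyocr.Reader(["th", "en"], gpu=False)
    return join_confident_text(reader.readtext(image_path), min_conf)
```

Calling `read_thai_text("sign.jpg")` on a local image (the file name is a placeholder) would return the recognized text as a single string. Filtering by confidence is one possible design choice; lowering `min_conf` keeps more detections at the cost of more noise.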