Adaptive Hindi OCR Using Generalized Hausdorff Image Comparison
dc.contributor.author | Ma, Huanfeng | en_US |
dc.contributor.author | Doermann, David | en_US |
dc.date.accessioned | 2004-05-31T23:31:43Z | |
dc.date.available | 2004-05-31T23:31:43Z | |
dc.date.created | 2003-08 | en_US |
dc.date.issued | 2003-09-25 | en_US |
dc.description.abstract | In this paper, we present an adaptive Hindi OCR using generalized Hausdorff image comparison implemented as part of a rapidly retargetable language tool effort. The system includes: script identification, character segmentation, training sample creation and character recognition. The OCR design (completed in one month) was applied to a complete Hindi-English bilingual dictionary (with 1083 pages) and a collection of ideal images extracted from Hindi documents in PDF format. Experimental results show the recognition accuracy can reach 88% for noisy images and 95% for ideal images, both at the character level. The presented method can also be extended to design OCR systems for different scripts. (LAMP-TR-105) (CAR-TR-987) (UMIACS-TR-2003-87) | en_US |
dc.format.extent | 528049 bytes | |
dc.format.mimetype | application/pdf | |
dc.identifier.uri | http://hdl.handle.net/1903/1307 | |
dc.language.iso | en_US | |
dc.relation.isAvailableAt | Digital Repository at the University of Maryland | en_US |
dc.relation.isAvailableAt | University of Maryland (College Park, Md.) | en_US |
dc.relation.isAvailableAt | Tech Reports in Computer Science and Engineering | en_US |
dc.relation.isAvailableAt | UMIACS Technical Reports | en_US |
dc.relation.ispartofseries | UM Computer Science Department; CS-TR-4519 | en_US |
dc.relation.ispartofseries | LAMP-TR-105 | en_US |
dc.relation.ispartofseries | CAR-TR-987 | en_US |
dc.relation.ispartofseries | UMIACS; UMIACS-TR-2003-87 | en_US |
dc.title | Adaptive Hindi OCR Using Generalized Hausdorff Image Comparison | en_US |
dc.type | Technical Report | en_US |