Adaptive Hindi OCR Using Generalized Hausdorff Image Comparison

dc.contributor.author: Ma, Huanfeng [en_US]
dc.contributor.author: Doermann, David [en_US]
dc.date.accessioned: 2004-05-31T23:31:43Z
dc.date.available: 2004-05-31T23:31:43Z
dc.date.created: 2003-08 [en_US]
dc.date.issued: 2003-09-25 [en_US]
dc.description.abstract: In this paper, we present an adaptive Hindi OCR using generalized Hausdorff image comparison implemented as part of a rapidly retargetable language tool effort. The system includes: script identification, character segmentation, training sample creation and character recognition. The OCR design (completed in one month) was applied to a complete Hindi-English bilingual dictionary (with 1083 pages) and a collection of ideal images extracted from Hindi documents in PDF format. Experimental results show the recognition accuracy can reach 88% for noisy images and 95% for ideal images, both at the character level. The presented method can also be extended to design OCR systems for different scripts. (LAMP-TR-105) (CAR-TR-987) (UMIACS-TR-2003-87) [en_US]
dc.format.extent: 528049 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/1903/1307
dc.language.iso: en_US
dc.relation.isAvailableAt: Digital Repository at the University of Maryland [en_US]
dc.relation.isAvailableAt: University of Maryland (College Park, Md.) [en_US]
dc.relation.isAvailableAt: Tech Reports in Computer Science and Engineering [en_US]
dc.relation.isAvailableAt: UMIACS Technical Reports [en_US]
dc.relation.ispartofseries: UM Computer Science Department; CS-TR-4519 [en_US]
dc.relation.ispartofseries: LAMP-TR-105 [en_US]
dc.relation.ispartofseries: CAR-TR-987 [en_US]
dc.relation.ispartofseries: UMIACS; UMIACS-TR-2003-87 [en_US]
dc.title: Adaptive Hindi OCR Using Generalized Hausdorff Image Comparison [en_US]
dc.type: Technical Report [en_US]

Files

Original bundle
Name: CS-TR-4519.pdf
Size: 515.67 KB
Format: Adobe Portable Document Format