University of Maryland LibrariesDigital Repository at the University of Maryland
    • Login
    View Item 
    •   DRUM
    • Theses and Dissertations from UMD
    • UMD Theses and Dissertations
    • View Item
    •   DRUM
    • Theses and Dissertations from UMD
    • UMD Theses and Dissertations
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    Processing Camera-captured Document Images: Geometric Rectification, Mosaicing, and Layout Structure Recognition

    Thumbnail
    View/Open
    umi-umd-3588.pdf (17.00Mb)
    No. of downloads: 1407

    Date
    2006-06-12
    Author
    Liang, Jian
    Advisor
    Chellappa, Ramalingam
    Metadata
    Show full item record
    Abstract
    This dissertation explores three topics: 1) geometric rectification of cameracaptured document images, 2) camera-captured document mosaicing, and 3) layout structure recognition. The first two topics pertain to camera-based document image analysis, a new trend within the OCR community. Compared to typical scanners,cameras offer convenient, flexible, portable, and non-contact image capture, which enables many new applications and breathes new life into existing ones. The third topic is related to the need for efficient metadata extraction methods, critical for managing digitized documents. The kernel of our geometric rectification framework is a novel method for estimating document shape from a single camera-captured image. Our method uses texture flows detected in printed text areas and is insensitive to occlusion. Classification of planar versus curved documents is done automatically. For planar pages, we obtain full metric rectification. For curved pages, we estimate a planar-strip approximation based on properties of developable surfaces. Our method can process any planar or smoothly curved document captured from an arbitrary position without requiring 3D data, metric data, or camera calibration. For the second topic, we design a novel registration method for document images, which produces good results in difficult situations including large displacements, severe projective distortion, small overlapping areas, and lack of distinguishable feature points. We implement a selective image composition method that outperforms conventional image blending methods in overlapping areas. It eliminates double images caused by mis-registration and preserves the sharpness in overlapping areas. We solve the third topic with a graph-based model matching framework. Layout structures are modeled by graphs, which integrate local and global features and are extensible to new features in the future. Our model can handle large variation within a class and subtle differences between classes. Through graph matching, the layout structure of a document is discovered. Our layout structure recognition technique accomplishes document classification and logical component labeling at the same time. Our model learning method enables a model to adapt to changes in classes over time.
    URI
    http://hdl.handle.net/1903/3659
    Collections
    • Electrical & Computer Engineering Theses and Dissertations
    • UMD Theses and Dissertations

    DRUM is brought to you by the University of Maryland Libraries
    University of Maryland, College Park, MD 20742-7011 (301)314-1328.
    Please send us your comments.
    Web Accessibility
     

     

    Browse

    All of DRUMCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

    My Account

    LoginRegister
    Pages
    About DRUMAbout Download Statistics

    DRUM is brought to you by the University of Maryland Libraries
    University of Maryland, College Park, MD 20742-7011 (301)314-1328.
    Please send us your comments.
    Web Accessibility