Data Representation for Learning and Information Fusion in Bioinformatics

Rajapakse, Vinodh Nalin

Files

Rajapakse_umd_0117E_14440.pdf (6.25 MB)

Date

2013

Authors

Rajapakse, Vinodh Nalin

Advisor

Czaja, Wojciech

Abstract

This thesis deals with the rigorous application of nonlinear dimension reduction and data organization techniques to biomedical data analysis. The Laplacian Eigenmaps algorithm is representative of these methods and has been widely applied in manifold learning and related areas. While the asymptotic manifold recovery behavior of these methods has been well characterized, the clustering properties of Laplacian embeddings with finite data are largely motivated by heuristic arguments. We develop a precise bound characterizing cluster structure preservation under Laplacian embeddings. From this foundation, we introduce flexible and mathematically well-founded approaches for information fusion and feature representation. These methods are applied to three substantial case studies in bioinformatics, illustrating their capacity to extract scientifically valuable information from complex data.

URI (handle)

http://hdl.handle.net/1903/14492

Collections

UMD Theses and Dissertations
Mathematics Theses and Dissertations
