CUR Matrix Approximation Through Convex Optimization

Linehan, Kathryn

CUR Matrix Approximation Through Convex Optimization

dc.contributor.advisor	Balan, Radu V	en_US
dc.contributor.author	Linehan, Kathryn	en_US
dc.contributor.department	Applied Mathematics and Scientific Computation	en_US
dc.contributor.publisher	Digital Repository at the University of Maryland	en_US
dc.contributor.publisher	University of Maryland (College Park, Md.)	en_US
dc.date.accessioned	2024-06-29T06:00:30Z
dc.date.available	2024-06-29T06:00:30Z
dc.date.issued	2024	en_US
dc.description.abstract	In this dissertation we present work on the CUR matrix approximation. Specifically, we present 1) an approximation of the proximal operator of the L-infinity norm using a neural network, 2) a novel deterministic CUR formulation and algorithm, and 3) a novel application of CUR as a feature selection method to determine discriminant proteins when clustering protein expression data in a self-organizing map (SOM). The proximal operator of the L-infinity norm arises in our CUR algorithm. Since the computation of the proximal operator of the L-infinity norm requires a sort of the input data (or at least a partial sort similar to quicksort), we present a neural network to approximate the proximal operator. A novel aspect of the network is that it is able to accept vectors of varying lengths due to a feature selection process that uses moments of the input data. We present results on the accuracy of the approximation, feature importance, and computational efficiency of the approach, and present an algorithm to calculate the proximal operator of the L-infinity norm exactly, relate it to the Moreau decomposition, and compare its computational efficiency to that of the approximation. Next, we present a novel deterministic CUR formulation that uses convex optimization to form the matrices C and R, and a corresponding algorithm that uses bisection to ensure that the user selected number of columns appear in C and the user selected number of rows appear in R. We implement the algorithm using the surrogate functional technique of Daubechies et al. [Communications on Pure and Applied Mathematics, 57.11 (2004)] and extend the theory of this approach to apply to our CUR formulation. Numerical results are presented that demonstrate the effectiveness of our CUR algorithm as compared to the singular value decomposition (SVD) and other CUR algorithms. Last, we use our CUR approximation as a feature selection method in the application by Higuera et al. [PLOS ONE, 10(6) (2015)] to determine discriminant proteins when clustering protein expression data in an SOM. This is a novel application of CUR and to the best of our knowledge, this is the first use of CUR on protein expression data. We compare the performance of our CUR algorithm to other CUR algorithms and the Wilcoxon rank-sum test (the original feature selection method in the work).	en_US
dc.identifier	https://doi.org/10.13016/tuk7-1ihu
dc.identifier.uri	http://hdl.handle.net/1903/32931
dc.language.iso	en	en_US
dc.subject.pqcontrolled	Applied mathematics	en_US
dc.subject.pquncontrolled	convex optimization	en_US
dc.subject.pquncontrolled	CUR matrix approximation	en_US
dc.subject.pquncontrolled	feature selection	en_US
dc.subject.pquncontrolled	proximal operator	en_US
dc.title	CUR Matrix Approximation Through Convex Optimization	en_US
dc.type	Dissertation	en_US

Files

Original bundle

Now showing 1 - 1 of 1

Name:: Linehan_umd_0117E_24208.pdf
Size:: 3.1 MB
Format:: Adobe Portable Document Format

Download

Collections

UMD Theses and Dissertations
Computer Science Theses and Dissertations
Mathematics Theses and Dissertations