UMD Theses and Dissertations
Permanent URI for this collection: http://hdl.handle.net/1903/3
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4 month delay in the appearance of a given thesis/dissertation in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
2 results
Search Results
Item CUR Matrix Approximation Through Convex Optimization (2024) Linehan, Kathryn; Balan, Radu V; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
In this dissertation we present work on the CUR matrix approximation. Specifically, we present 1) an approximation of the proximal operator of the L-infinity norm using a neural network, 2) a novel deterministic CUR formulation and algorithm, and 3) a novel application of CUR as a feature selection method to determine discriminant proteins when clustering protein expression data in a self-organizing map (SOM). The proximal operator of the L-infinity norm arises in our CUR algorithm. Since the computation of the proximal operator of the L-infinity norm requires a sort of the input data (or at least a partial sort similar to quicksort), we present a neural network to approximate the proximal operator. A novel aspect of the network is that it is able to accept vectors of varying lengths due to a feature selection process that uses moments of the input data. We present results on the accuracy of the approximation, feature importance, and computational efficiency of the approach. We also present an algorithm to calculate the proximal operator of the L-infinity norm exactly, relate it to the Moreau decomposition, and compare its computational efficiency to that of the approximation. Next, we present a novel deterministic CUR formulation that uses convex optimization to form the matrices C and R, and a corresponding algorithm that uses bisection to ensure that the user-selected number of columns appear in C and the user-selected number of rows appear in R. We implement the algorithm using the surrogate functional technique of Daubechies et al. [Communications on Pure and Applied Mathematics, 57.11 (2004)] and extend the theory of this approach to apply to our CUR formulation.
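As an illustration of the relationship between the proximal operator of the L-infinity norm and the Moreau decomposition mentioned in this abstract, the following is a minimal sketch and not the dissertation's algorithm. It relies on the standard identity prox of lam*||.||_inf at x equals x minus the Euclidean projection of x onto the L1 ball of radius lam, and it computes that projection with a common sort-based method. The function names `project_l1_ball` and `prox_linf` are hypothetical and chosen only for this example.

```python
import math


def project_l1_ball(x, radius):
    """Euclidean projection of x onto {z : ||z||_1 <= radius}, using a full sort."""
    if radius <= 0:
        return [0.0] * len(x)
    abs_x = [abs(v) for v in x]
    if sum(abs_x) <= radius:
        return list(x)  # already inside the ball
    u = sorted(abs_x, reverse=True)  # this sort is the dominant cost
    cumsum = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumsum += u_j
        t = (cumsum - radius) / j
        if u_j - t > 0:
            theta = t  # keep the threshold from the last index that qualifies
    # Soft-threshold every entry by theta, preserving signs.
    return [math.copysign(max(abs(v) - theta, 0.0), v) for v in x]


def prox_linf(x, lam):
    """prox of lam * ||.||_inf at x via the Moreau decomposition (assumed identity)."""
    return [v - p for v, p in zip(x, project_l1_ball(x, lam))]


if __name__ == "__main__":
    print(prox_linf([3.0, 1.0, -2.0], 1.0))  # [2.0, 1.0, -2.0]
```

The sort inside `project_l1_ball` is the step the abstract identifies as the computational bottleneck that motivates a learned approximation.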
Numerical results are presented that demonstrate the effectiveness of our CUR algorithm as compared to the singular value decomposition (SVD) and other CUR algorithms. Last, we use our CUR approximation as a feature selection method in the application by Higuera et al. [PLOS ONE, 10(6) (2015)] to determine discriminant proteins when clustering protein expression data in an SOM. This is a novel application of CUR and to the best of our knowledge, this is the first use of CUR on protein expression data. We compare the performance of our CUR algorithm to other CUR algorithms and the Wilcoxon rank-sum test (the original feature selection method in the work).
Item MACHINERY ANOMALY DETECTION UNDER INDETERMINATE OPERATING CONDITIONS (2018) Tian, Jing; Pecht, Michael; Mechanical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Anomaly detection is a critical task in system health monitoring. Current practice of anomaly detection in machinery systems is still unsatisfactory. One issue is with the use of features. Some features are insensitive to the change of health, and some are redundant with each other. These insensitive and redundant features in the data mislead the detection. Another issue is from the influence of operating conditions, where a change in operating conditions can be mistakenly detected as an anomalous state of the system. Operating conditions are usually changing, and they may not be readily identified. They contribute to false positive detection either from non-predictive features driven by operating conditions, or from influencing predictive features. This dissertation contributes to the reduction of false detection by developing methods to select predictive features and use them to span a space for anomaly detection under indeterminate operating conditions. Available feature selection methods fail to provide consistent results when some features are correlated.
A method was developed in this dissertation to explore the correlation structure of features and group correlated features into the same clusters. A representative feature from each cluster is selected to form a non-correlated set of features, from which an optimized subset of predictive features is selected. After feature selection, the influence of operating conditions through non-predictive variables is removed. To remove the influence on predictive features, a clustering-based anomaly detection method is developed. Observations are collected when the system is healthy, and these observations are grouped into clusters corresponding to the states of operating conditions with automatic estimation of clustering parameters. Anomalies are detected if the test data are not members of the clusters. Correct partitioning of clusters is an open challenge due to the lack of research on the clustering of machinery health monitoring data. This dissertation uses unimodality of the data as a criterion for clustering validation, and a unimodality-based clustering method is developed. The methods of this dissertation were evaluated on simulated data, benchmark data, an experimental study, and field data. These methods provide consistent results and outperform representatives of available methods. Although the focus of this dissertation is on the application to machinery systems, the methods developed in this dissertation can be adapted to other application scenarios for anomaly detection, feature selection, and clustering.
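The clustering-based detection idea in this abstract (cluster healthy observations by operating condition, then flag test points that belong to no cluster) can be sketched as follows. This is a rough illustration under stated assumptions, not the dissertation's method: plain k-means with a fixed, user-supplied `k` stands in for the unimodality-based clustering with automatic parameter estimation, and cluster membership is approximated by a radius equal to the largest centroid distance seen among healthy members. All function names and the `margin` parameter are hypothetical.

```python
import math


def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Add the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[idx].append(p)
        new_centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, groups


def fit_detector(healthy, k):
    """Learn one centroid and one radius per operating-condition cluster."""
    centroids, groups = kmeans(healthy, k)
    radii = [
        max((math.dist(p, c) for p in g), default=0.0)
        for c, g in zip(centroids, groups)
    ]
    return centroids, radii


def is_anomaly(x, centroids, radii, margin=1.0):
    """Flag x when it falls outside every healthy cluster (assumed membership rule)."""
    return all(math.dist(x, c) > margin * r for c, r in zip(centroids, radii))


if __name__ == "__main__":
    # Two hypothetical operating conditions, both healthy.
    healthy = [(0, 0), (0, 1), (1, 0), (1, 1),
               (10, 10), (10, 11), (11, 10), (11, 11)]
    centroids, radii = fit_detector(healthy, k=2)
    print(is_anomaly((0.5, 0.6), centroids, radii))  # False: inside a healthy cluster
    print(is_anomaly((5.0, 5.0), centroids, radii))  # True: member of no cluster
```

Because each operating condition gets its own cluster, a shift from one healthy condition to another is not flagged, which is the false-positive source the abstract describes.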