Theses and Dissertations from UMD
Permanent URI for this community: http://hdl.handle.net/1903/2
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a four-month delay in the appearance of a given thesis/dissertation in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
Search Results
Item: Developing and Measuring Latent Constructs in Text (2024)
Hoyle, Alexander Miserlis; Resnik, Philip; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Constructs---like inflation, populism, or paranoia---are of fundamental concern to social science. Constructs are the vocabulary over which theory operates, and so a central activity is the development and measurement of latent constructs from observable data. Although the social sciences comprise fields with different epistemological norms, they share a concern for valid operationalizations that transparently map between data and measure. Economists at the US Bureau of Labor Statistics, for example, follow a hundred-page handbook to sample the egg prices that constitute the Consumer Price Index; clinical psychologists rely on suites of psychometric tests to diagnose schizophrenia. In many fields, this observable data takes the form of language: as a social phenomenon, language data can encode many of the latent social constructs that people care about. Commensurate with both the increasing sophistication of language technologies and the amount of available data, there has thus emerged a "text-as-data" paradigm aimed at "amplifying and augmenting" the analyses that compose research. At the same time, Natural Language Processing (NLP), the field from which the analysis tools originate, has often remained separate from real-world problems and guiding theories---at least when it comes to social science. Instead, it focuses on atomized tasks under the assumption that progress on low-level language aspects will generalize to higher-level problems that involve overlapping elements. This dissertation focuses on NLP methods and evaluations that facilitate the development and measurement of latent constructs from natural language, while remaining sensitive to the social sciences' need for interpretability and validity.

Item: TOWARDS FULLY AUTOMATED ENHANCED SAMPLING OF NUCLEATION WITH MACHINE-LEARNING METHODS (2024)
Zou, Ziyue; Tiwary, Pratyush; Chemistry; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Molecular dynamics (MD) simulation has become a powerful tool for modeling complex molecular systems in physics, materials science, biology, and many other fields of study, as it provides both temporal and spatial resolution. However, phenomena of common research interest are often rare events, such as nucleation, protein conformational changes, and ligand binding, which occur on timescales far beyond what brute-force all-atom MD simulations can achieve within practical computer time. This makes MD simulation difficult to use for studying the thermodynamics and kinetics of rare events. Therefore, it is common practice to employ enhanced sampling techniques to accelerate the sampling of rare events. Many of these methods require performing dimensionality reduction from atomic coordinates to a low-dimensional representation that captures the key information needed to describe such transitions. To better understand the current challenges in studying crystal nucleation with computer simulations, the goal is to first apply developed dimensionality reduction methods to such systems. Here, I will present two studies applying different machine learning (ML) methods to the study of crystal nucleation under different conditions, i.e., in vacuum and in solution.
I investigated how such meaningful low-dimensional representations, termed reaction coordinates (RCs), were constructed as linear or non-linear combinations of features. Using these representations along with enhanced sampling methods, I achieved robust state-to-state back-and-forth transitions. In particular, I focused on the case of urea, a small molecule composed of 8 atoms, which can be easily sampled and is commonly used in daily practice as fertilizer in agriculture and as a nitrogen source in organic synthesis. I then analyzed my samples and benchmarked them against other experimental and computational studies. Given the challenges in studying crystal nucleation using molecular dynamics simulations, I aim to introduce new methods to facilitate research in this field. In the second half of the dissertation, I focused on presenting novel methods to learn low-dimensional representations directly from atomic coordinates, without the aid of a priori known features, utilizing advanced machine learning techniques. To test my methods, I applied them to several representative model systems, including Lennard-Jones 7 clusters, alanine dipeptide, and alanine tetrapeptide. The first system is known for its well-documented dynamics in colloidal rearrangements relevant to materials science studies, while the latter two systems represent problems related to conformational changes in biophysical studies. Beyond model systems, I also applied my methods to more complex physical systems in the field of materials science, specifically iron atoms and glycine molecules. Notably, the enhanced sampling method integrated with my approaches successfully sampled robust state-to-state transitions between allotropes of iron and polymorphs of glycine.
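As a toy illustration of what a linear reaction coordinate looks like in practice (a sketch only, not the method developed in this dissertation), the snippet below uses linear discriminant analysis to combine two hypothetical order parameters into a single coordinate that separates a disordered ensemble from an ordered one; the feature names and data are invented.

# Illustrative sketch: a linear reaction coordinate from labeled order parameters.
# The features and data are hypothetical placeholders, not the dissertation's method.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Synthetic order-parameter snapshots for two metastable states
# (columns could be, e.g., a coordination number and an orientational order parameter).
disordered = rng.normal(loc=[4.0, 0.2], scale=0.3, size=(500, 2))
ordered = rng.normal(loc=[8.0, 0.7], scale=0.3, size=(500, 2))

X = np.vstack([disordered, ordered])
y = np.array([0] * 500 + [1] * 500)  # 0 = disordered, 1 = ordered

# The fitted discriminant direction gives the weights of a linear reaction coordinate.
lda = LinearDiscriminantAnalysis().fit(X, y)
weights = lda.coef_.ravel()

# The RC value of any configuration is a weighted sum of its features.
rc = X @ weights
print("RC weights:", weights)
print("mean RC per state:", rc[y == 0].mean(), rc[y == 1].mean())

In an enhanced-sampling run, a bias potential would then act along a coordinate of this kind to drive back-and-forth transitions between the two states.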
Item: Dynamic EM Ray Tracing for Complex Outdoor and Indoor Environments with Multiple Receivers (2024)
Wang, Ruichen; Manocha, Dinesh; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Ray tracing models for visual, aural, and EM simulations have advanced, gaining traction in dynamic applications such as 5G, autonomous vehicles, and traffic systems. Dynamic ray tracing, which models EM wave paths and their interactions with moving objects, poses many challenges in complex urban areas due to environmental variability, data scarcity, and computational needs. In response to these challenges, we've developed new methods that use a dynamic coherence-based approach for ray tracing simulations across EM bands. Our approach is designed to enhance efficiency by improving the recomputation of the bounding volume hierarchy (BVH) and by caching propagation paths. With our formulation, we've observed a reduction in computation time by about 30%, all while maintaining a level of accuracy comparable to that of other simulators. Building on our dynamic approach, we've made further refinements to our algorithm to better model channel coherence, spatial consistency, and the Doppler effect. Our EM ray tracing algorithm can incrementally improve the accuracy of predictions relating to the movement and positioning of dynamic objects in the simulation. We've also integrated the Uniform Geometrical Theory of Diffraction (UTD) with our ray tracing algorithm. This enhancement is designed to allow for more accurate simulations of diffraction around smooth surfaces, especially in complex indoor settings, where accurate prediction is important.
Taking another step forward, we've combined machine learning (ML) techniques with our dynamic ray tracing framework. Leveraging a modified conditional Generative Adversarial Network (cGAN) that incorporates encoded geometry and transmitter location, we demonstrate better efficiency and accuracy of simulations in various indoor environments, with a 5X speedup. Our method aims not only to improve the prediction of received power in complex layouts and reduce simulation times, but also to lay the groundwork for future developments in EM simulation technologies, potentially including real-time applications in 6G networks. We evaluate the performance of our methods in various environments to highlight their advantages. In dynamic urban scenes, we demonstrate our algorithm's scalability to vast areas and multiple receivers while maintaining accuracy and efficiency compared to prior methods; for complex geometries and indoor environments, we compare our accuracy with analytical solutions as well as existing EM ray tracing systems.
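For readers unfamiliar with conditional GANs, the PyTorch skeleton below shows the general plumbing of a generator and discriminator conditioned on an encoded scene plus a transmitter location. It is a hedged sketch only: the layer sizes, dimensions, and names are placeholders and do not reproduce the modified cGAN architecture described above.

# Illustrative cGAN skeleton: generator and discriminator conditioned on an encoded
# scene plus transmitter position, producing a flattened received-power map.
# All sizes and names are hypothetical placeholders.
import torch
import torch.nn as nn

COND_DIM = 64 + 3      # e.g., 64-d geometry encoding + (x, y, z) transmitter location
NOISE_DIM = 32
MAP_DIM = 32 * 32      # flattened received-power grid

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM + COND_DIM, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, MAP_DIM),
        )

    def forward(self, z, cond):
        return self.net(torch.cat([z, cond], dim=1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(MAP_DIM + COND_DIM, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),   # real/fake logit for a (map, condition) pair
        )

    def forward(self, power_map, cond):
        return self.net(torch.cat([power_map, cond], dim=1))

# One forward pass with random tensors to show the data flow.
gen, disc = Generator(), Discriminator()
cond = torch.randn(8, COND_DIM)          # batch of encoded scenes + Tx locations
z = torch.randn(8, NOISE_DIM)
fake_map = gen(z, cond)                  # predicted received-power maps
logit = disc(fake_map, cond)             # discriminator judges map/condition pairs
print(fake_map.shape, logit.shape)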
Item: Analyzing and Enhancing Molecular Dynamics Through the Synergy of Physics and Artificial Intelligence (2024)
Wang, Dedi; Tiwary, Pratyush; Biophysics (BIPH); Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Rapid advances in computational power have made all-atom molecular dynamics (MD) a powerful tool for studying systems in biophysics, chemical physics, and beyond. By solving Newton's equations of motion in silico, MD simulations allow us to track the time evolution of complex molecular systems at all-atom, femtosecond resolution, enabling the evaluation of both their thermodynamic and kinetic properties. Though MD simulations are powerful, their effectiveness is often hampered by the large amount of data they produce. For instance, a standard microsecond-long simulation of a protein can easily generate hundreds of gigabytes of data, which can be difficult to analyze. Moreover, the time required to conduct these simulations can be prohibitively long. Microsecond-long simulations often take weeks to complete, whereas the processes of interest may occur on the timescale of milliseconds or even hundreds of seconds. These factors collectively pose significant challenges in leveraging MD simulations for comprehensive analysis and exploration of chemical and biological systems. In this thesis, I address these challenges by leveraging physics-inspired insights to learn unique, useful, and meaningful low-dimensional representations of complex molecular systems. These representations enable effective analysis and interpretation of the vast amount of data generated from experiments and simulations. They have proven valuable in providing mechanistic insights into some fundamental problems within theoretical chemistry and biophysics, such as understanding the interplay between long-range and short-range forces in ion pair dissociation and the transformation of proteins from unstable random coils to structured forms. Furthermore, these physics-informed representations play a crucial role in enhancing MD simulations. They facilitate the construction of simplified kinetic models, enabling the generation of dynamical trajectories spanning significantly longer time scales than those accessible by conventional MD simulations. Additionally, they can serve as blueprints to guide the sampling process in combination with existing enhanced sampling methods.
Through this thesis, I showcase how the synergy between physics and AI can advance our understanding of molecular systems and facilitate more efficient and insightful analysis in the fields of computational chemistry and biophysics.

Item: Prediction of Marine Timber Pile Damage Ratings Using a Gradient Boosted Regression Model (2023)
Willmott, Carly; Attoh-Okine, Nii O.; Civil Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Marine pilings are critical structural elements exposed to harsh environmental conditions. Specialized routine inspection and regular maintenance are essential to keep marine facilities in good working condition. These activities generate data that can be exploited for knowledge gain with machine learning tools. A gradient boosted regression tree algorithm, XGBoost, was applied to datasets that contain timber pile underwater inspection and repair data over a period of 23 years. First, the data was visualized to show the longevity of different timber pile repair types. An XGBoost model was then tuned and trained on a dataset for timber piles at one pier. Variables in the dataset were evaluated for feature importance in predicting damage ratings assigned during routine underwater inspections. Next, an ensemble of XGBoost models was trained and applied to a second dataset containing the same features for an adjacent pier. This dataset was reserved for testing to demonstrate whether the ensemble trained on one pier's data could be generalized to predict timber pile damage ratings at a nearby but separate pier. Finally, the ensemble was used to predict damage ratings on piles that had earlier data but were not rated in the two most recent inspection events. Results suggest that the ensemble is capable of predicting timber pile damage ratings to approximately +/- one damage rating on both the training and test datasets. Feature importances revealed that half of the variables (time since the first event, repair type, exposed pile length, and time since the last repair) contributed two-thirds of the relative importance in predicting damage ratings. Data visualization showed that a few repair types, such as pile replacements and encapsulations, appeared to be most successful over the long term compared with shorter-lived repairs like wraps and encasements. These results are promising indications of the advantages machine learning algorithms can offer in processing and gleaning new insights from structural repair and inspection data. Economic benefits to marine facility owners can potentially be realized through earlier anticipation of repairs and more targeted inspection and rehabilitation efforts. There are also opportunities for a better understanding of deterioration rates if more data is gathered over the lifespans of structures, along with more detailed data that can be introduced as new features.
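The pipeline described above maps naturally onto xgboost's scikit-learn interface. The sketch below trains a regressor on a synthetic table whose column names echo the important features listed in the abstract; the data, hyperparameters, and column names are hypothetical, not the study's.

# Illustrative sketch of gradient boosted regression of pile damage ratings.
# Column names mirror features mentioned in the abstract; data and settings are invented.
import numpy as np
import pandas as pd
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(42)
n = 1000
df = pd.DataFrame({
    "years_since_first_event": rng.uniform(0, 23, n),
    "repair_type": rng.integers(0, 5, n),          # encoded categories, e.g., wrap, encasement
    "exposed_pile_length_ft": rng.uniform(5, 40, n),
    "years_since_last_repair": rng.uniform(0, 15, n),
})
# Synthetic damage rating (0 = none ... 4 = severe) loosely tied to the features.
raw = (0.1 * df["years_since_first_event"]
       + 0.05 * df["exposed_pile_length_ft"]
       + rng.normal(0, 0.5, n))
df["damage_rating"] = np.clip(np.round(raw / raw.max() * 4), 0, 4)

X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="damage_rating"), df["damage_rating"], test_size=0.2, random_state=0)

model = XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X_train, y_train)

pred = model.predict(X_test)
print("MAE (damage-rating units):", mean_absolute_error(y_test, pred))
print("feature importances:", dict(zip(X_train.columns, model.feature_importances_)))

An ensemble of such models, as in the abstract, could simply be several regressors trained on different folds or random seeds whose predictions are averaged.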
Item: The Limitations of Deep Learning Methods in Realistic Adversarial Settings (2023)
Kaya, Yigitcan; Dumitras, Tudor; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The study of adversarial examples has evolved from a niche phenomenon to a well-established branch of machine learning (ML). In the conventional view of an adversarial attack, the adversary takes an input sample, e.g., an image of a dog, and applies a deliberate transformation to this input, e.g., a rotation. This then causes the victim model to abruptly change its prediction, e.g., the rotated image is classified as a cat. Most prior work has adapted this view across different applications and provided powerful attack algorithms as well as defensive strategies to improve robustness. The progress in this domain has been influential for both research and practice, and it has produced a perception of better security. Yet, security literature tells us that adversaries often do not follow a specific threat model, and adversarial pressure can exist in unprecedented ways. In this dissertation, I will start from the threats studied in the security literature to highlight the limitations of the conventional view and extend it to capture realistic adversarial scenarios. First, I will discuss how adversaries can pursue goals other than hurting the predictive performance of the victim. In particular, an adversary can wield adversarial examples to perform denial-of-service against emerging ML systems that rely on input-adaptiveness for efficient predictions. Our attack algorithm, DeepSloth, can transform the inputs to offset the computational benefits of these systems. Moreover, an existing conventional defense is ineffective against DeepSloth and poses a trade-off between efficiency and security. Second, I will show how the conventional view leads to a false sense of security for anomalous input detection methods. These methods build modern statistical tools around deep neural networks and have been shown to be successful in detecting conventional adversarial examples. As a general-purpose analogue of blending attacks in the security literature, we introduce the Statistical Indistinguishability Attack (SIA). SIA bypasses a range of published detection methods by producing anomalous samples that are statistically similar to normal samples. Third, and finally, I will focus on malware detection with ML, a domain where adversaries gain leverage over ML naturally, without deliberately perturbing inputs as in the conventional view. Security vendors often rely on ML to automate malware detection due to the large volume of new malware. A standard approach to detection is collecting runtime behaviors of programs in controlled environments (sandboxes) and feeding them to an ML model. I first observed that a model trained using this approach performs poorly when it is deployed on program behaviors from realistic, uncontrolled environments, which gives malware authors an advantage in causing harm. We attribute this deterioration to distribution shift and investigate possible improvements by adapting modern ML techniques, such as distributionally robust optimization. Overall, my dissertation work has reinforced the importance of considering comprehensive threat models and applications with well-documented adversaries for properly assessing the security risks of ML.
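To make the "conventional view" concrete, the sketch below applies the classic fast gradient sign method (FGSM) perturbation to a toy classifier: a small, deliberate transformation of the input computed from the loss gradient. It illustrates only the baseline setting the dissertation argues is too narrow, not DeepSloth, SIA, or the malware work; the model and data are placeholders.

# Illustrative FGSM sketch of a "conventional" adversarial example.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in classifier
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(1, 1, 28, 28, requires_grad=True)   # stand-in input image
y = torch.tensor([3])                               # its assumed true label

# Gradient of the loss w.r.t. the input, then an epsilon-sized step in its sign:
# a small, deliberate transformation intended to flip the prediction.
loss = loss_fn(model(x), y)
loss.backward()
epsilon = 0.1
x_adv = (x + epsilon * x.grad.sign()).clamp(0.0, 1.0).detach()

print("clean prediction:      ", model(x).argmax(dim=1).item())
print("adversarial prediction:", model(x_adv).argmax(dim=1).item())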
Item: SUSTAINABILITY, ACCEPTANCE RISK ANALYSIS AND MACHINE LEARNING IN ASSESSING MECHANICAL PROPERTIES AND THE IMPACT OF HIGHWAY MATERIALS IN TRANSPORTATION INFRASTRUCTURE (2023)
Zhao, Yunpeng; Goulias, Dimitrios G; Civil Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Improving the performance and extending the service life of transportation infrastructure is a long-standing goal of the Federal Highway Administration (FHWA) and the transportation community.
Accurate prediction of the mechanical properties of highway materials is indispensable for enhancing the sustainability and resilience of transportation infrastructure, since it provides accurate inputs for pavement mechanistic-empirical (ME) design and the prediction of pavement distresses, helping to optimally allocate maintenance needs and reduce testing frequencies, which account for costly expenditures. Accurate prediction of material properties can also reduce acceptance risks during quality assurance (QA) without conducting extensive testing. Concrete plays an important role in the construction of transportation infrastructure. Developing an empirical and/or statistical model for accurately predicting compressive strength remains challenging and requires extensive experimental work. Thus, the objective of the study was to improve the prediction of concrete compressive strength using ML algorithms. An ML pipeline was proposed in which a two-layer stacked model was developed by combining seven individual ML models. Feature engineering was implemented, and feature importance was evaluated to provide better interpretability of the data and the model. This study promotes a more thorough assessment of alternative ML algorithms for predicting material properties. In addition, the quality of highway materials and construction translates directly to performance. To develop a statistically sound QA specification, the risks to the agency and contractor must be well understood. In this study, a Monte Carlo simulation model was developed to systematically assess the acceptance risks and the implications for pay factors (PF). The simulation was conducted using typical acceptance quality characteristics (AQCs), such as strength, for Portland cement concrete (PCC) pavements. The analysis indicated that specific combinations of contractor and agency sample sizes and population characteristics have a greater impact on acceptance risks and may produce inconsistent PF. The proposed methodology aids both agencies and producers in better understanding and evaluating the impact of sample sizes and population characteristics on acceptance risks and PF. Finally, the use of recycled materials is a key element in generating sustainable pavement designs that save natural resources and reduce energy use, greenhouse gas (GHG) emissions, and costs. This study proposed a methodological life cycle assessment (LCA) framework to quantify the environmental and economic impacts of using recycled materials in pavement construction and rehabilitation. The LCA was conducted on two roadway projects with innovative recycled materials, such as construction and demolition waste (CDW) and rock dust. The proposed LCA framework can be used elsewhere to quantify the environmental and economic benefits of using recycled materials in pavements.
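The acceptance-risk simulation can be pictured with a stripped-down Monte Carlo sketch: repeatedly draw contractor and agency samples from the same strength population, estimate percent-within-limits (PWL) from each, and compare the resulting pay factors. The specification limit, PWL estimator, and pay-factor curve below are simplified placeholders rather than the study's actual AQC specification.

# Illustrative Monte Carlo sketch of QA acceptance risk for a strength AQC.
# Spec limit, PWL estimation, and pay-factor schedule are simplified assumptions.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
LSL = 4000.0  # assumed lower specification limit on compressive strength, psi

def pay_factor(pwl):
    # Simplified linear pay-factor schedule (placeholder, not the study's PF equation).
    return float(np.clip(55.0 + 0.5 * pwl, 0.0, 105.0))

def estimated_pwl(sample):
    # Crude percent-within-limits estimate from a fitted normal distribution.
    mu, sigma = sample.mean(), sample.std(ddof=1)
    return 100.0 * (1.0 - norm.cdf(LSL, loc=mu, scale=sigma))

def pf_gap(n_contractor, n_agency, true_mean, true_std, n_lots=5000):
    # Both parties sample the same population each lot, but with different sample
    # sizes; record the disagreement between the pay factors they would assign.
    gaps = np.empty(n_lots)
    for i in range(n_lots):
        pf_c = pay_factor(estimated_pwl(rng.normal(true_mean, true_std, n_contractor)))
        pf_a = pay_factor(estimated_pwl(rng.normal(true_mean, true_std, n_agency)))
        gaps[i] = pf_c - pf_a
    return gaps

for n_c, n_a in [(4, 4), (10, 5)]:
    gaps = pf_gap(n_c, n_a, true_mean=4500.0, true_std=400.0)
    print(f"contractor n={n_c}, agency n={n_a}: mean gap={gaps.mean():.2f} PF points, "
          f"P(|gap|>5)={np.mean(np.abs(gaps) > 5):.2f}")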
Item: Exploiting Causal Reasoning to Improve the Quantitative Risk Assessment of Engineering Systems Through Interventions and Counterfactuals (2023)
Ruiz-Tagle, Andres; Groth, Katrina; Mechanical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The main strength of quantitative risk assessment (QRA) is to enable risk management by providing causal insights into the risk of an engineering system or process. Bayesian Networks (BNs) have become popular in QRA because they offer an improved causal structure that represents analysts' knowledge of a system and enable reasoning under uncertainty.
Currently, the use of BNs for risk-informed decisions is based solely on associative reasoning, answering questions of the form "If we observe X=x, how likely is it to observe Y=y?" However, risk management in industry relies on understanding how a system could change in response to external influences (e.g., interventions and decisions) and identifying the causes and mechanisms that could explain the outcome of past events (e.g., accident investigations and lessons learned). This dissertation shows that associative reasoning alone is insufficient to provide these insights, and it provides a framework for obtaining more complex causal insight using BNs with intervention and counterfactual reasoning. Intervention and counterfactual reasoning must be implemented along with BNs to provide more complex insights about the risk of a system. Intervention reasoning answers queries of the form "How does doing X=x change the likelihood of observing Y=y?" and can be used to inform the causal effect of interventions and decisions on the risk and reliability of a system. Counterfactual reasoning answers queries of the form "Had X been X=x' in an event, instead of the observed X=x, could Y have been Y=y', instead of the observed Y=y?" and can be used to learn from past events and improve safety management activities. BNs present a unique opportunity as a risk modeling approach that incorporates the complex causal dependencies present in a system's variables and allows reasoning under uncertainty. Therefore, exploiting the causal reasoning capabilities of BNs in QRAs can be highly beneficial for improving modern risk analysis. The goal of this work is to define how to exploit the causal reasoning capabilities of BNs to support intervention and counterfactual reasoning in the QRA of complex systems and processes. To achieve this goal, this research first establishes the mathematical background and methods required to model interventions and counterfactuals within a BN approach. Then, we demonstrate the proposed methods with two case studies concerning the risk of third-party excavation damage to natural gas pipelines in the U.S. The first case study showed that the intervention reasoning methods developed in this work produce unbiased causal insights into the effectiveness of implementing new excavation practices. The second case study showed how the counterfactual reasoning methods developed in this work can expand on the lessons learned from the accident investigation of the Sun Prairie 2018 gas explosion by providing new insights into the effectiveness of current damage prevention practices. Finally, associative, intervention, and counterfactual reasoning methods with BNs were integrated into a single model and used to assess the risk of a highly complex challenge for the future of clean energy: excavation damage to natural gas pipelines transporting hydrogen. The impact of this research is a first-of-its-kind approach and a novel set of QRA methods that provide expanded causal insights for understanding failures and accidents in complex engineering systems and processes.
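The gap between associative and interventional queries can be seen in a three-node toy network with a confounder Z (say, site complexity) influencing both a damage-prevention practice X and damage Y. The sketch below computes P(Y=1 | X=1) and P(Y=1 | do(X=1)) by direct enumeration; the structure and probabilities are invented and unrelated to the dissertation's pipeline models.

# Associative vs. interventional queries on a tiny discrete Bayesian network
# Z -> X, Z -> Y, X -> Y (all binary). The numbers are invented for illustration.
P_Z = {0: 0.7, 1: 0.3}                       # Z = site complexity
P_X_given_Z = {0: 0.2, 1: 0.8}               # P(X=1 | Z): practice used more on complex sites
P_Y_given_XZ = {(0, 0): 0.05, (0, 1): 0.30,  # P(Y=1 | X, Z): damage probability
                (1, 0): 0.02, (1, 1): 0.15}

def joint(x, y, z):
    px = P_X_given_Z[z] if x == 1 else 1 - P_X_given_Z[z]
    py = P_Y_given_XZ[(x, z)] if y == 1 else 1 - P_Y_given_XZ[(x, z)]
    return P_Z[z] * px * py

# Associative: P(Y=1 | X=1), i.e., condition on having observed the practice.
num = sum(joint(1, 1, z) for z in (0, 1))
den = sum(joint(1, y, z) for y in (0, 1) for z in (0, 1))
p_assoc = num / den

# Interventional: P(Y=1 | do(X=1)), i.e., cut the Z -> X edge and average over Z
# (backdoor adjustment): "what if we made every site use the practice?"
p_do = sum(P_Y_given_XZ[(1, z)] * P_Z[z] for z in (0, 1))

print(f"P(Y=1 | X=1)     = {p_assoc:.3f}")
print(f"P(Y=1 | do(X=1)) = {p_do:.3f}")

Because Z raises both the use of the practice and the chance of damage, conditioning on X=1 overstates the damage probability relative to the interventional answer in this toy example.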
Item: Quantitative Motion Analysis of the Upper Limb: Establishment of Normative Kinematic Datasets and Systematic Comparison of Motion Analysis Systems (2022)
Wang, Sophie Linyi; Kontson, Kimberly L; White, Ian; Bioengineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Upper limb prosthetic devices with advanced capabilities are currently in development. These advancements bring to light the importance of objectively and quantitatively measuring the effectiveness and benefit of these devices. Recently, the application of motion capture (i.e., digital tracking of upper body movements in space) to performance-based outcome measures has gained traction as a possible tool for human movement assessment that could facilitate optimal device selection, track rehabilitative progress, and inform device regulation and review. While motion capture shows promise, the clinical, regulatory, and industry communities would benefit from access to large clinical and normative datasets from different motion capture systems and a better understanding of the advantages and limitations of different motion capture approaches. The first objective of this dissertation is to establish kinematic datasets of normative and upper-limb prosthesis user motion. The normative kinematic distributions of many performance-based outcome measures are not established, and it is difficult to determine departures from normative patterns without relevant clinical expertise. In Specific Aim 1, normative and clinically relevant datasets were created using a gold-standard motion capture system to record participants performing standardized tasks from outcome measures. Without kinematic data, it is also difficult to identify informative kinematic features and tasks that exhibit characteristic differences from normative motion. The second objective is to identify salient kinematic characteristics associated with departures from normative motion. In Specific Aim 2, an unsupervised K-means machine learning algorithm was applied to the previously collected data to determine motions and tasks that distinguish between normative and prosthesis user movement. The third objective is to compare three commonly used motion capture systems that vary in motion tracking mechanisms. In Specific Aim 3, the most informative tasks and kinematic characteristics previously identified will be used to evaluate how well motion capture systems with varying tracking methods detect these differences.
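The clustering step in Specific Aim 2 can be sketched with scikit-learn's KMeans applied to per-trial kinematic feature vectors. The features (joint ranges of motion, trunk lean, movement time) and the synthetic data below are placeholders for the actual motion-capture datasets described in the abstract.

# Illustrative K-means sketch: cluster per-trial kinematic feature vectors and check
# how the clusters line up with normative vs. prosthesis-user trials.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)

# Hypothetical features per trial:
# [shoulder ROM (deg), elbow ROM (deg), trunk lean (deg), movement time (s)]
normative = rng.normal([60, 90, 5, 2.0], [8, 10, 2, 0.3], size=(40, 4))
prosthesis = rng.normal([45, 60, 15, 3.0], [8, 10, 4, 0.5], size=(40, 4))

X = np.vstack([normative, prosthesis])
group = np.array([0] * 40 + [1] * 40)   # known group labels, used only for inspection

# Standardize so no single feature dominates the distance metric, then cluster.
X_std = StandardScaler().fit_transform(X)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_std)

# Cross-tabulate cluster assignment against group membership.
for g, name in [(0, "normative"), (1, "prosthesis user")]:
    counts = np.bincount(labels[group == g], minlength=2)
    print(f"{name:16s} -> cluster counts: {counts}")

Tasks or features for which the clusters align closely with group membership would be candidates for the "most informative" characteristics carried into Specific Aim 3.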
Item: STRUCTURAL PERFORMANCE ASSESSMENT ON PREVENTIVE MAINTENANCE/REHABILITATION OF STEEL GIRDER BRIDGE SYSTEMS (2022)
Zhu, Yifan; Fu, Chung C.; Civil Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Bridge maintenance, including preventive maintenance, rehabilitation, and replacement, keeps the structure safe throughout its service life. Bridge maintenance methods have developed and expanded to include bridge inspection, bridge condition assessment, structural health monitoring technologies, service life prediction, and maintenance with new materials or technologies. This dissertation proposes two structural performance assessments: (1) a rapid machine learning assessment that classifies whether a design falls within an acceptable range; and (2) a comparison of structural health monitoring data with engineers' predictions to evaluate current structural performance. The first part of this dissertation focuses on preventive maintenance evaluation, planning the instrumentation of a newly constructed link slab system on a multi-span simply supported bridge.
Before any simulation and field testing of the general steel I-girder bridge model was conducted, literature on bridge maintenance, performance assessment, bridge durability, structural health monitoring methods, current condition assessment methods, and the material and structural behavior of link slabs was reviewed and investigated. Comprehensive experimental programs on the new materials HPFRC and ECC were then conducted, and the extensive laboratory results were used to update the nonlinearity and accuracy of the structural model. Moreover, a series of data analyses of current steel bridges in the United States was conducted to establish a database. Two sets of simulation-based finite element parametric analyses of the bridge, with preventive maintenance or structural repairs, were introduced to generate preferred designs and to support rapid performance assessment. The inputs are the configuration of the bridge and the proposed work or deteriorated location, which generate the dataset for training, validating, and testing the evaluation model. The resulting regression model, built by taking advantage of machine learning, allows for quick assessment. This research verifies the assessment's results using the Maryland Transportation Authority (MDTA) pilot HPFRC link slab system on an I-95 overpass as a case study, comparing the predictions with actual structural health monitoring data and evaluating the system's performance. This dissertation also examines a case from the Maryland State Highway Administration (MDSHA): a bridge over the Patapsco River with several frozen expansion rocker bearings, which restrained the longitudinal movement of the superstructure under thermal expansion and contraction. The bridge partially recovered to its normal condition after the pier under the affected bearing was repaired and strengthened. In this study, we combined finite element analysis and monitoring data to simulate structural behavior and evaluate the repair work using the proposed methods. Finally, this research evaluates the repaired bridge with partially strengthened structural components (i.e., deck, girders, or piers) and forecasts its wear and tear. The original and deteriorated bridges were numerically modeled first and simplified to 3D grid models. Then, the current condition rating process was used to determine structural performance at the element level. After establishing the evaluation criteria and investigating the corresponding condition ratings, rapid assessment models for the repaired bridges were carried out. The condition prediction relied on historical ratings and, where available, monitoring data.
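The "rapid assessment" idea, a regression surrogate fitted to parametric finite element results and then queried for a proposed design, can be sketched as follows. The input columns, acceptance threshold, and data are hypothetical stand-ins for the dissertation's FE-generated dataset.

# Illustrative sketch of a rapid-assessment surrogate trained on (hypothetical)
# parametric finite element results: inputs describe the bridge/work configuration,
# the output is a response ratio checked against an assumed acceptance threshold.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 800
data = pd.DataFrame({
    "span_length_m": rng.uniform(20, 45, n),
    "girder_spacing_m": rng.uniform(1.8, 3.5, n),
    "link_slab_thickness_mm": rng.uniform(150, 300, n),
    "deterioration_location": rng.integers(0, 3, n),   # encoded: deck / girder / pier
})
# Synthetic demand-to-capacity ratio standing in for the FE output.
data["response_ratio"] = (
    0.015 * data["span_length_m"] + 0.05 * data["girder_spacing_m"]
    - 0.0008 * data["link_slab_thickness_mm"] + rng.normal(0, 0.02, n) + 0.4
)

X = data.drop(columns="response_ratio")
y = data["response_ratio"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

surrogate = GradientBoostingRegressor(n_estimators=300, max_depth=3).fit(X_train, y_train)

# Rapid check of a proposed design against an assumed acceptance limit of 0.85.
proposed = pd.DataFrame([{"span_length_m": 38.0, "girder_spacing_m": 2.7,
                          "link_slab_thickness_mm": 220.0, "deterioration_location": 1}])
ratio = surrogate.predict(proposed)[0]
print(f"predicted response ratio: {ratio:.3f} -> "
      f"{'acceptable' if ratio <= 0.85 else 'needs detailed review'}")
print(f"test R^2: {surrogate.score(X_test, y_test):.3f}")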