Theses and Dissertations from UMD
Permanent URI for this community: http://hdl.handle.net/1903/2
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a four-month delay before a given thesis/dissertation appears in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
Search Results
8 results
Item: A Goal, Question, Metric Approach to Coherent Use Integration Within the DevOps Lifecycle (2022)
Rassmann, Kelsey Anne; Regli, William; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The development of high-stakes artificial intelligence (AI) technology creates a possibility for disastrous errors of misuse and disuse. Despite these risks, AI still needs to be developed in a timely manner as it has the potential to positively impact users and their surrounding environment. High-stakes AI needs to “move fast” but it must not “break things.” This thesis provides developers with a methodology that will allow them to establish human-AI coherency while maintaining the development speed of the DevOps software development lifecycle. First, I will present a model of the human-machine interaction (HMI) which will motivate a new mode of AI use entitled ‘Coherent Use.’ Then, I will describe a Goal, Question, Metric approach to maximizing Coherent Use which will integrate directly into the DevOps lifecycle. Finally, I will simulate the usage of this template through an existing software product.

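To give a rough sense of how a Goal, Question, Metric template can be recorded alongside a DevOps pipeline, here is a minimal Python sketch; the goal, questions, metric names, and lifecycle stages are invented for illustration and are not taken from the thesis.

    # Hypothetical GQM record for tracking "Coherent Use" across DevOps iterations.
    # The structure and field names are assumptions, not the thesis's actual template.
    from dataclasses import dataclass, field

    @dataclass
    class Metric:
        name: str
        collect_stage: str          # e.g., the "monitor" stage of the DevOps loop

    @dataclass
    class Question:
        text: str
        metrics: list = field(default_factory=list)

    @dataclass
    class Goal:
        purpose: str
        questions: list = field(default_factory=list)

    coherent_use_goal = Goal(
        purpose="Maximize coherent use of the deployed AI component",
        questions=[
            Question(
                text="How often do operators override the AI's recommendation?",
                metrics=[Metric("override_rate", "monitor")],
            ),
            Question(
                text="How often is the AI used outside its validated operating envelope?",
                metrics=[Metric("out_of_envelope_invocations", "operate")],
            ),
        ],
    )

    for q in coherent_use_goal.questions:
        print(q.text, "->", [m.name for m in q.metrics])
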
Item: Improving the Usability of Static Analysis Tools Using Machine Learning (2019)
Koc, Ugur; Porter, Adam A.; Foster, Jeffrey S.; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Static analysis can be useful for developers to detect critical security flaws and bugs in software. However, due to challenges such as scalability and undecidability, static analysis tools often have performance and precision issues that reduce their usability and thus limit their wide adoption. In this dissertation, we present machine learning-based approaches to improve the adoption of static analysis tools by addressing two usability challenges: false positive error reports and proper tool configuration. First, false positives are one of the main reasons developers give for not using static analysis tools. To address this issue, we developed a novel machine learning approach for learning directly from program code to classify the analysis results as true or false positives. The approach has two steps: (1) data preparation that transforms source code into certain input formats for processing by sophisticated machine learning techniques; and (2) using the sophisticated machine learning techniques to discover code structures that cause false positive error reports and to learn false positive classification models. To evaluate the effectiveness and efficiency of this approach, we conducted a systematic, comparative empirical study of four families of machine learning algorithms, namely hand-engineered features, bag of words, recurrent neural networks, and graph neural networks, for classifying false positives. In this study, we considered two application scenarios using multiple ground-truth program sets. Overall, the results suggest that recurrent neural networks outperformed the other algorithms, although interesting tradeoffs are present among all techniques. Our observations also provide insight into the future research needed to speed the adoption of machine learning approaches in practice. Second, many static program verification tools come with configuration options that present tradeoffs between performance, precision, and soundness to allow users to customize the tools for their needs. However, understanding the impact of these options and correctly tuning the configurations is a challenging task, requiring domain expertise and extensive experimentation. To address this issue, we developed an automatic approach, auto-tune, to configure verification tools for given target programs. The key idea of auto-tune is to leverage a meta-heuristic search algorithm to probabilistically scan the configuration space using machine learning models both as a fitness function and as an incorrect result filter. auto-tune is tool- and language-agnostic, making it applicable to any off-the-shelf configurable verification tool. To evaluate the effectiveness and efficiency of auto-tune, we applied it to four popular program verification tools for C and Java and conducted experiments under two use-case scenarios. Overall, the results suggest that running verification tools using auto-tune produces results that are comparable to configurations manually tuned by experts, and in some cases improve upon them with reasonable precision.

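As an illustration of the auto-tune idea described above (meta-heuristic search over a tool's configuration space, with a learned model as the fitness function), the following Python sketch uses simple hill climbing; the option names, the neighbor move, and the stub predict_quality model are assumptions made for illustration, not auto-tune's actual design.

    # Generic sketch of configuration search guided by a learned model. The options,
    # the neighbor move, and predict_quality() are illustrative placeholders.
    import random

    OPTIONS = {                      # hypothetical verifier options and their values
        "unroll_depth": [1, 2, 5, 10],
        "pointer_analysis": ["fast", "precise"],
        "check_overflow": [True, False],
    }

    def random_config():
        return {k: random.choice(v) for k, v in OPTIONS.items()}

    def neighbor(cfg):
        # Mutate one option to obtain a nearby configuration.
        cfg = dict(cfg)
        k = random.choice(list(OPTIONS))
        cfg[k] = random.choice(OPTIONS[k])
        return cfg

    def predict_quality(cfg):
        # Placeholder for a trained model scoring expected precision/soundness;
        # a random score keeps the sketch runnable.
        return random.random()

    def search(iterations=100):
        best = random_config()
        best_score = predict_quality(best)
        for _ in range(iterations):
            cand = neighbor(best)
            score = predict_quality(cand)
            if score > best_score:       # greedy hill climbing for simplicity
                best, best_score = cand, score
        return best, best_score

    if __name__ == "__main__":
        print(search())
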
Item: Model-Based Testing of Off-Nominal Behaviors (2017)
Schulze, Christoph; Cleaveland, Rance; Lindvall, Mikael; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Off-nominal behaviors (ONBs) are unexpected or unintended behaviors that may be exhibited by a system. They can be caused by implementation and documentation errors and are often triggered by unanticipated external stimuli, such as unforeseen sequences of events, out-of-range data values, or environmental issues. System specifications typically focus on nominal behaviors (NBs), and do not refer to ONBs or their causes or explain how the system should respond to them. In addition, untested occurrences of ONBs can compromise the safety and reliability of a system. This can be very dangerous in mission- and safety-critical systems, like spacecraft, where software issues can lead to expensive mission failures, injuries, or even loss of life. In order to ensure the safety of the system, potential causes for ONBs need to be identified and their handling in the implementation has to be verified and documented. This thesis describes the development and evaluation of model-based techniques for the identification and documentation of ONBs. Model-Based Testing (MBT) techniques have been used to provide automated support for thorough evaluation of software behavior. In MBT, models are used to describe the system under test (SUT) and to derive test cases for that SUT. The thesis is divided into two parts. The first part develops and evaluates an approach for the automated generation of MBT models and their associated test infrastructure. The test infrastructure is responsible for executing the generated test cases of the models. The models and the test infrastructure are generated from manual test cases for web-based systems, using a set of heuristic transformation rules and leveraging the structured nature of the SUT. This improvement to the MBT process was motivated by three case studies we conducted that evaluated MBT's effectiveness and efficiency for identifying ONBs. Our experience led us to develop automated approaches to model and test-infrastructure creation, since these were some of the most time-consuming tasks associated with MBT. The second part of the thesis presents a framework and associated tooling for the extraction and analysis of specifications for identifying and documenting ONBs. The framework infers behavioral specifications in the form of system invariants from automatically generated test data using data-mining techniques (e.g., association-rule mining). The framework follows an iterative test -> infer -> instrument -> retest paradigm, where the initial invariants are refined with additional test data. This work shows how the scalability and accuracy of the resulting invariants can be improved with the help of static data- and control-flow analysis. Other improvements include an algorithm that leverages the iterative process to accurately infer invariants from variables with continuous values. Our evaluations of the framework have shown the utility of such automatically generated invariants as a means for updating and completing system specifications; they are also useful as a means of understanding system behavior, including ONBs.

Item: A Binary Classifier for Test Case Feasibility Applied to Automatically Generated Tests of Event-Driven Software (2016)
Robbins, Bryan Thomas; Memon, Atif; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Modern software application testing, such as the testing of software driven by graphical user interfaces (GUIs) or leveraging event-driven architectures in general, requires paying careful attention to context. Model-based testing (MBT) approaches first acquire a model of an application, then use the model to construct test cases covering relevant contexts. A major shortcoming of state-of-the-art automated model-based testing is that many test cases proposed by the model are not actually executable. These infeasible test cases threaten the integrity of the entire model-based suite, and any coverage of contexts the suite aims to provide. In this research, I develop and evaluate a novel approach for classifying the feasibility of test cases. I identify a set of pertinent features for the classifier, and develop novel methods for extracting these features from the outputs of MBT tools. I use a supervised logistic regression approach to obtain a model of test case feasibility from a randomly selected training suite of test cases. I evaluate this approach with a set of experiments. The outcomes of this investigation are as follows: I confirm that infeasibility is prevalent in MBT, even for test suites designed to cover a relatively small number of unique contexts. I confirm that the frequency of infeasibility varies widely across applications. I develop and train a binary classifier for feasibility with average overall error, false positive, and false negative rates under 5%. I find that unique event IDs are key features of the feasibility classifier, while model-specific event types are not. I construct three types of features from the event IDs associated with test cases, and evaluate the relative effectiveness of each within the classifier. To support this study, I also develop a number of tools and infrastructure components for scalable execution of automated jobs, which use state-of-the-art container and continuous integration technologies to enable parallel test execution and the persistence of all experimental artifacts.

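Relating to the Schulze thesis above, the sketch below mines simple implication-style invariants from observed test states, in the spirit of the association-rule idea the abstract mentions; the observation records, variable names, and confidence threshold are invented, and a real framework would collect states via instrumentation rather than a hard-coded list.

    # Mine candidate implication invariants "a implies b" from test observations.
    # Everything here (variables, values, threshold) is illustrative only.
    from itertools import permutations
    from collections import Counter

    # Each record is one observed program state (variable -> value) from a test run.
    observations = [
        {"mode": "safe", "armed": False, "speed": 0},
        {"mode": "safe", "armed": False, "speed": 3},
        {"mode": "active", "armed": True, "speed": 7},
        {"mode": "active", "armed": True, "speed": 9},
    ]

    # Keep only implications that held every time their antecedent was observed.
    MIN_CONFIDENCE = 1.0

    facts = [{(k, v) for k, v in obs.items()} for obs in observations]
    candidates = Counter()
    for state in facts:
        for a, b in permutations(state, 2):
            candidates[(a, b)] += 1

    for (a, b), count in candidates.items():
        antecedent_count = sum(1 for state in facts if a in state)
        if count / antecedent_count >= MIN_CONFIDENCE:
            print(f"candidate invariant: {a[0]}=={a[1]!r} implies {b[0]}=={b[1]!r}")

    # With so few observations many coincidental rules appear; an iterative
    # infer/instrument/retest loop like the one described above would prune them.
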
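For the Robbins thesis above, here is a minimal sketch of a logistic-regression feasibility classifier over bag-of-event-ID features; the tiny synthetic test cases and labels are made up, and the use of scikit-learn is an assumption made for illustration rather than the tooling used in the dissertation.

    # Train a feasibility classifier from event-ID features of generated test cases.
    # Data and labels are synthetic; scikit-learn is assumed to be installed.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    # Each test case is represented by the sequence of event IDs it would execute.
    test_cases = [
        "e1 e4 e7", "e1 e2 e9", "e3 e4 e7", "e1 e4 e8",
        "e5 e6 e7", "e5 e2 e9", "e3 e6 e7", "e5 e6 e8",
    ]
    feasible = [1, 1, 1, 1, 0, 0, 1, 0]   # 1 = executable, 0 = infeasible (made up)

    vectorizer = CountVectorizer()         # bag-of-event-IDs features
    X = vectorizer.fit_transform(test_cases)

    clf = LogisticRegression().fit(X, feasible)
    new_case = vectorizer.transform(["e5 e4 e7"])
    print("predicted feasible:", bool(clf.predict(new_case)[0]))
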
Item: Source Code Reduction to Summarize False Positives (2015)
Marenchino, Matias; Porter, Adam; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The main disadvantage of static code analysis tools is the high rate of false positives they produce. Users may need to manually analyze a large number of warnings to determine whether they are false or legitimate, reducing the benefits of automatic static analysis. Our long-term goal is to significantly reduce the number of false positives that these tools report. A learning system could classify the warnings into true positives and false positives by means of features extracted from the program source code. This work implements and evaluates a technique to reduce the source code producing false positives into code snippets that are simpler to analyze. Results indicate that the method considerably reduces the source code size and that it is feasible to use it to characterize false positives.

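The Marenchino abstract above does not spell out the reduction algorithm, so the following is only a generic greedy-reduction sketch: it repeatedly drops chunks of lines while a placeholder oracle, still_reports_warning, says the analyzer would still emit the warning of interest; the oracle and the sample program are invented for illustration.

    # Greedily shrink a program to a snippet that still triggers the warning.
    # still_reports_warning() stands in for actually re-running the analyzer.
    def still_reports_warning(lines):
        # Stand-in oracle: pretend the warning depends only on these two lines.
        return "x = source()" in lines and "sink(x)" in lines

    def reduce_source(lines):
        chunk = max(1, len(lines) // 2)
        while chunk >= 1:
            i = 0
            while i < len(lines):
                candidate = lines[:i] + lines[i + chunk:]
                if still_reports_warning(candidate):
                    lines = candidate          # chunk was irrelevant, drop it
                else:
                    i += chunk                 # chunk is needed, keep it
            chunk //= 2
        return lines

    program = ["import os", "x = source()", "y = 42", "log(y)", "sink(x)", "cleanup()"]
    print(reduce_source(program))   # -> ['x = source()', 'sink(x)']
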
Item: Large Scale Distributed Testing for Fault Classification and Isolation (2010)
Fouche, Sandro Maleewatana; Porter, Adam A.; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Developing confidence in the quality of software is an increasingly difficult problem. As the complexity and integration of software systems increases, the tools and techniques used to perform quality assurance (QA) tasks must evolve with them. To date, several quality assurance tools have been developed to help ensure the quality of modern software, but there are still several limitations to be overcome. Among the challenges faced by current QA tools are (1) increased use of distributed software solutions, (2) limited test resources and constrained time schedules, and (3) difficult-to-replicate and possibly rarely occurring failures. While existing distributed continuous quality assurance (DCQA) tools and techniques, including our own Skoll project, begin to address these issues, new and novel approaches are needed to address these challenges. This dissertation explores three strategies to do this. First, I present an improved version of our Skoll distributed quality assurance system. Skoll provides a platform for executing sophisticated, long-running QA processes across a large number of distributed, heterogeneous computing nodes. This dissertation details changes to Skoll resulting in a more robust, configurable, and user-friendly implementation for both the client and server components. Additionally, this dissertation details infrastructure development done to support the evaluation of DCQA processes using Skoll -- specifically the design and deployment of a dedicated 120-node computing cluster for evaluating DCQA practices. The techniques and case studies presented in the latter parts of this work leveraged the improvements to Skoll as their testbed. Second, I present techniques for automatically classifying test execution outcomes based on an adaptive-sampling classification technique, along with a case study on the Java Architecture for Bytecode Analysis (JABA) system. One common need for these techniques is the ability to distinguish test execution outcomes (e.g., to collect only data corresponding to some behavior or to determine how often and under which conditions a specific behavior occurs). Most current approaches, however, do not perform any kind of classification of remote executions and either focus on easily observable behaviors (e.g., crashes) or assume that outcomes' classifications are externally provided (e.g., by the users). In this work, I present an empirical study on JABA where we automatically classified execution data into passing and failing behaviors using adaptive association trees. Finally, I present a long-term case study of the highly configurable MySQL open-source project. Real-world software systems can involve configuration spaces that are too large to test exhaustively, but that nonetheless contain subtle interactions that lead to failure-inducing system faults. In the literature, covering arrays, in combination with classification techniques, have been used to effectively sample these large configuration spaces and to detect problematic configuration dependencies. Applying this approach in practice, however, is tricky because testing time and resource availability are unpredictable. Therefore, we developed and evaluated an alternative approach that incrementally builds covering array schedules. This approach begins at a low strength and then iteratively increases the strength as resources allow, reusing previous test results to avoid duplicated effort. The results are test schedules that allow for successful classification with fewer test executions and that require less test-subject-specific information to develop.

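To make the incremental covering-array idea above concrete, here is a simplified Python sketch that, for each interaction strength t, lists the t-way option-value combinations not yet covered by configurations that have already been tested, so earlier results are reused as the strength grows; the option model and the already_run configurations are invented, and the dissertation's actual scheduling algorithm is more involved.

    # For increasing strength t, find uncovered t-way option-value combinations,
    # reusing coverage from configurations that were already executed.
    from itertools import combinations, product

    OPTIONS = {
        "engine": ["innodb", "myisam"],
        "charset": ["utf8", "latin1"],
        "logging": ["on", "off"],
    }

    def covered_tuples(config, t):
        """All t-way (option, value) combinations exercised by one configuration."""
        items = sorted(config.items())
        return set(combinations(items, t))

    already_run = [
        {"engine": "innodb", "charset": "utf8", "logging": "on"},
        {"engine": "myisam", "charset": "latin1", "logging": "off"},
    ]

    for t in (1, 2):                              # grow strength as resources allow
        needed = set()
        for opts in combinations(sorted(OPTIONS), t):
            for values in product(*(OPTIONS[o] for o in opts)):
                needed.add(tuple(zip(opts, values)))
        done = set().union(*(covered_tuples(c, t) for c in already_run))
        print(f"strength {t}: {len(needed - done)} uncovered combinations remain")
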
Item: Behavioral Reflexion Models for Software Architecture (2010)
Ackermann, Christopher Florian; Cleaveland, Rance; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Developing and maintaining software is difficult and error-prone. This can at least partially be attributed to its constantly evolving nature. Requirements are seldom finalized before development begins, but evolve constantly as new details about the system become known and stakeholder objectives change. As requirements are altered, the software architecture must be updated to accommodate these changes. This includes updating the architecture documentation, which serves as the design specification as well as a means of comprehending complex systems. Furthermore, the software architecture of the implementation must be adapted in order to ensure that the system complies with both functional and non-functional requirements. In practice, however, software changes are often applied in an ad hoc manner. As a result, the implementation frequently deviates from the architecture documentation, rendering the latter useless for supporting system engineers in comprehending the system and aiding maintenance tasks. Furthermore, errors that are introduced during the implementation lead to discrepancies between the system and the intended architecture design. Consequently, it cannot be guaranteed that the system meets the desired quality objectives, such as reliability and dependability. We present the behavioral reflexion model approach, which aims to support the system engineer in identifying and resolving discrepancies between software architecture representations. In our approach, the system engineer is supported in producing architecture documentation that reflects the intended architecture. Furthermore, discrepancies between the implementation and documentation are identified. These discrepancies are then illustrated graphically in a reflexion model, which guides debugging activities. In this research, we are concerned with architecture representations of system behaviors and focus in particular on distributed systems. In this thesis, we describe how architecture discrepancies are introduced and the implications for the reliability and maintainability of the system. We then discuss the individual components of the behavioral reflexion model approach in detail. Finally, we provide an evaluation of our approach in the form of two case studies. In these studies, we applied the behavioral reflexion model approach to two space-mission systems with the goal of resolving problems in their reliability and maintainability.

Item: SeSFJava: A Framework for Design and Assertion-Testing Of Concurrent Systems (2005-08-04)
Elsharnouby, Tamer Mahmoud; Shankar, A. Udaya; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Many elegant formalisms have been developed for specifying and reasoning about concurrent systems. However, these formalisms have not been widely used by developers and programmers of concurrent systems. One reason is that most formal methods involve techniques and tools not familiar to programmers, for example, a specification language very different from C, C++, or Java. SeSF is a framework for design, verification, and testing of concurrent systems that attempts to address these concerns by keeping the theory close to the programmer's world. SeSF considers "layered compositionality". Here, a composite system consists of layers of component systems, and "services" define the allowed sequences of interactions between layers. SeSF uses conventional programming languages to define services. Specifically, SeSF is a markup language that can be integrated with any programming language. We have integrated SeSF into Java, resulting in what we call SeSFJava. We developed a testing harness for SeSFJava, called the SeSFJava Harness, in which a (distributed) SeSFJava program can be executed and the execution checked against its service and any other correctness assertion. A key capability of the SeSFJava Harness is that one can test the final implementation of a concurrent system, rather than just an abstract representation of it. We have two major applications of SeSFJava and the Harness. The first is to the TCP transport layer, where the service specification is cast in SeSFJava and the system is tested under the SeSFJava Harness. The second is to a Gnutella network. We define the intended services of Gnutella -- which, to the best of our knowledge, was not done before -- and we tested an open-source implementation, namely Furi, against the service.

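The central idea above, checking an execution against a service that defines the allowed sequences of interactions between layers, can be illustrated independently of SeSFJava's notation; the Python sketch below encodes a hypothetical service as a small state machine and checks traces against it, with the states and events invented for illustration rather than taken from the thesis.

    # Check an observed event trace against a "service" given as a state machine.
    # The service, states, and events are hypothetical examples.
    ALLOWED = {
        ("closed", "connect_req"): "connecting",
        ("connecting", "connect_ack"): "open",
        ("open", "send"): "open",
        ("open", "close_req"): "closed",
    }

    def check_trace(events, state="closed"):
        for ev in events:
            nxt = ALLOWED.get((state, ev))
            if nxt is None:
                return False, f"event '{ev}' not allowed in state '{state}'"
            state = nxt
        return True, "trace conforms to the service"

    print(check_trace(["connect_req", "connect_ack", "send", "close_req"]))
    print(check_trace(["connect_req", "send"]))   # violates the service
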