UMD Theses and Dissertations

Permanent URI for this collection: http://hdl.handle.net/1903/3

New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date, so a given thesis/dissertation may take up to four months to appear in DRUM.

More information is available on the Theses and Dissertations page of the University of Maryland Libraries.

Search Results

Now showing 1 - 3 of 3
  • Continuous, Effort-Aware Prediction of Software Security Defects
    (2015) Stuckman, Jeffrey Charles; Purtilo, James; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Software security defects are coding flaws which allow for a system's security to be compromised. Due to the potential severity of these defects, it is important to discover them quickly; therefore, they are a good focus for software quality improvement efforts such as code inspection. Our research focuses on vulnerability prediction models, which use machine learning to identify code that has an elevated likelihood of containing these defects. In particular, we study continuous prediction models, which repeatedly search for vulnerable code over a period of time, rather than being used at just one particular moment. To empirically evaluate the prediction methodologies that we define, we collected a fine-grained dataset of vulnerabilities in PHP applications. We then defined and implemented a method for defining families of features, or metrics, which characterize both the change in code over time and the state of the code at a given moment, enabling a systematic and fair comparison of continuous and traditional prediction models. We also introduce a methodology for effort-sensitive learning, which optimizes to minimize the expected cost of inspecting the code that is ultimately flagged by the model. Our results show that the security defects in our dataset were long-lived, with a median lifetime of 871 days. Continuous prediction more readily discriminated vulnerable from non-vulnerable code than traditional static prediction did, and prediction was more efficient when changes were broken apart by file than when they were aggregated together. However, high code churn negated some of the efficiency gains of continuous predictors in simulations, and the optimal prediction method in a given scenario depended on making a tradeoff between speed of detection and cost savings. As an additional contribution, we have released the fine-grained defect dataset -- the first of its kind -- to the public, in order to encourage future work in this field.
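The effort-sensitive idea in this abstract lends itself to a small illustration. The sketch below is not the dissertation's actual model; the file names, risk scores, and line counts are invented. It simply ranks flagged code units by predicted risk per line of code and inspects them until a fixed effort budget (lines reviewed) is spent.

```python
# Hypothetical sketch of effort-aware inspection: order flagged code
# units by predicted risk per line of code, then inspect in that order
# until the effort budget (total lines reviewed) would be exceeded.

def effort_aware_ranking(units, budget_loc):
    """units: list of (name, predicted_risk, lines_of_code) tuples."""
    # Rank by risk density (risk per line), not raw risk score.
    ranked = sorted(units, key=lambda u: u[1] / u[2], reverse=True)
    inspected, spent = [], 0
    for name, risk, loc in ranked:
        if spent + loc > budget_loc:
            break
        inspected.append(name)
        spent += loc
    return inspected

# Invented example data: (file, predicted risk, lines of code).
units = [
    ("login.php",  0.9, 120),
    ("utils.php",  0.2, 300),
    ("upload.php", 0.7,  80),
    ("index.php",  0.1,  50),
]
print(effort_aware_ranking(units, budget_loc=250))
# → ['upload.php', 'login.php', 'index.php']
```

Ranking by risk density rather than raw risk is what makes the prioritization effort-aware: a small file with moderate risk can be a better use of a reviewer's time than a large file with a slightly higher score.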
  • COMPATIBILITY TESTING FOR COMPONENT-BASED SYSTEMS
    (2010) YOON, ILCHUL; Sussman, Alan; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Many component-based systems are deployed in diverse environments, each with different components and with different component versions. To ensure the system builds correctly for all deployable combinations (or, configurations), developers often perform compatibility testing by building their systems on various configurations. However, due to the large number of possible configurations, testing all configurations is often infeasible, and in practice, only a handful of popular configurations are tested; as a result, errors can escape to the field. This problem is compounded when components evolve over time and when test resources are limited. To address these problems, in this dissertation I introduce a process, algorithms and a tool called Rachet. First, I describe a formal modeling scheme for capturing the system configuration space, and a sampling criterion that determines the portion of the space to test. I describe an algorithm to sample configurations satisfying the sampling criterion and methods to test the sampled configurations. Second, I present an approach that incrementally tests compatibility between components, so as to accommodate component evolution. I describe methods to compute test obligations, and algorithms to produce configurations that test the obligations, attempting to reuse test artifacts. Third, I present an approach that prioritizes and tests configurations based on developers' preferences. Configurations are tested, by default starting from the most preferred one as requested by a developer, but cost-related factors are also considered to reduce overall testing time. 
The testing approaches presented are applied to two large-scale systems in the high-performance computing domain, and experimental results show that the approaches can (1) identify compatibility between components effectively and efficiently, (2) make the process of compatibility testing more practical under constant component evolution, and also (3) help developers achieve preferred compatibility results early in the overall testing process when time and resources are limited.
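One way to picture the sampling step described above is greedy pairwise sampling of the configuration space: choose configurations until every pair of component-version choices appears in at least one tested build. This is only an illustrative sketch, not Rachet's actual algorithm (which operates over an annotated component model); the component names and versions below are invented.

```python
from itertools import product, combinations

def pairwise_sample(options):
    """Greedy sketch: pick configurations until every pair of
    component-version choices is covered by at least one of them.
    options: dict mapping component name -> list of versions."""
    comps = sorted(options)
    all_configs = [dict(zip(comps, vs))
                   for vs in product(*(options[c] for c in comps))]
    # All (component, version) pairs that must co-occur somewhere.
    uncovered = {((a, va), (b, vb))
                 for a, b in combinations(comps, 2)
                 for va in options[a] for vb in options[b]}
    chosen = []
    while uncovered:
        # Greedily take the configuration covering the most new pairs.
        best = max(all_configs, key=lambda c: sum(
            1 for (p, q) in uncovered
            if c[p[0]] == p[1] and c[q[0]] == q[1]))
        chosen.append(best)
        uncovered -= {(p, q) for (p, q) in uncovered
                      if best[p[0]] == p[1] and best[q[0]] == q[1]}
    return chosen

# Invented configuration space for an HPC-style software stack.
opts = {"compiler": ["gcc", "clang"],
        "mpi": ["openmpi", "mpich"],
        "os": ["linux", "mac"]}
sample = pairwise_sample(opts)
```

With three two-valued components the exhaustive space has eight configurations, but the greedy pairwise sample covers all component-version pairs with fewer builds, which is the kind of saving a sampling criterion buys when exhaustive testing is infeasible.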
  • Development of An Empirical Approach to Building Domain-Specific Knowledge Applied to High-End Computing
    (2006-07-23) Hochstein, Lorin Michael; Basili, Victor R; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This dissertation presents an empirical approach for building and storing knowledge about software engineering through human-subject research. It is based on running empirical studies in stages, where previously held hypotheses are supported or refuted in different contexts, and new hypotheses are generated. The approach is both mixed-methods based and opportunistic, and focuses on identifying a diverse set of potential sources for running studies. The output produced is an experience base which contains a set of these hypotheses, the empirical evidence which generated them, and the implications for practitioners and researchers. This experience base is contained in a software system which can be navigated by stakeholders to trace the "chain of evidence" of hypotheses as they evolve over time and across studies. This approach has been applied to the domain of high-end computing, to build knowledge related to programmer productivity. The methods include controlled experiments and quasi-experiments, case studies, observational studies, interviews, surveys, and focus groups. The results of these studies have been stored in a proof-of-concept system that implements the experience base.
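The "chain of evidence" navigation described above can be illustrated with a minimal record structure. This is a hypothetical sketch, not the proof-of-concept system itself; the hypothesis statements and study identifiers are invented.

```python
# Hypothetical sketch of an experience-base record: each hypothesis
# keeps its supporting evidence and a link to the earlier hypothesis
# it refines, so a stakeholder can walk the chain of evidence.

class Hypothesis:
    def __init__(self, statement, evidence, refines=None):
        self.statement = statement
        self.evidence = evidence   # study identifiers (invented)
        self.refines = refines     # earlier Hypothesis, or None

def chain_of_evidence(h):
    """Trace from the original hypothesis to its latest refinement."""
    chain = []
    while h is not None:
        chain.append(h.statement)
        h = h.refines
    return list(reversed(chain))

h1 = Hypothesis("MPI code is hard for novices",
                ["classroom study A"])
h2 = Hypothesis("MPI effort concentrates in communication code",
                ["case study B"], refines=h1)
print(chain_of_evidence(h2))
# → ['MPI code is hard for novices', 'MPI effort concentrates in communication code']
```

Storing the refinement link alongside the evidence is what lets hypotheses be supported or refuted in new contexts while keeping their history navigable across studies.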