University of Maryland Libraries
Digital Repository at the University of Maryland

    Interactive Event Sequence Query and Transformation

    Monroe_umd_0117E_15112.pdf (22.66 MB)
    No. of downloads: 2022

    Date
    2014
    Author
    Monroe, Megan
    Advisor
    Shneiderman, Ben
    Plaisant, Catherine
    Abstract
    In our burgeoning world of pervasive sensors and affordable data storage, records of timestamped events are being produced across nearly every domain of personal and professional computing. This temporal event data is a fundamental component of electronic health records, process logs, sports analytics, and more. Across all domains, however, there are two overarching needs: (1) to understand population-level trends and patterns, and (2) to identify important subsets of individual records. Visual analytics tools are billed as the solution to both of these problems. A huge volume of work has demonstrated the ability of these tools to facilitate user-guided data exploration and hypothesis generation across a wide range of data types. What is typically ignored, however, is the process that takes place between data collection and this exploration stage, a process frequently referred to as data wrangling. For many data types, wrangling consists mostly of restructuring spreadsheet columns and renaming fields. For temporal event data, though, this wrangling process can extend much further, to the data itself, where event patterns must be transformed to better reflect either the real-world events that generated them or the perspective of a given study. Without this step, population-level trends can be obscured beyond the point of recognition, and important subsets of records become impossible to discern. Temporal event data wrangling, however, is deceptively difficult and error-prone, even for expert users. Standard, command-based query languages are poorly suited for specifying even the simplest event patterns and, in systems that are not specifically designed for handling temporal constructs, these queries are executed using a series of slow and inefficient self-join operations. Attempts at more accessible query languages frequently omit critical features such as events that occur over a period of time (intervals) or the absence of an event.
Perhaps most important, query alone is not enough to get users through a typical temporal event data wrangling process. Event patterns not only need to be found, but also transformed and re-represented. Temporal event wrangling is just as much about revisal as it is about retrieval, and given the ubiquity of this data type, an effective solution on this front has the potential to hugely impact the way that we utilize this data to inform future decisions. An improved query and wrangling process would not only benefit database professionals, but also dramatically increase the range of users who can access this type of data, particularly domain-expert medical researchers. This dissertation demonstrates the ability of the EventFlow visualization tool to extend beyond the typical bounds of data exploration and serve as a critical aid for both temporal event query and data transformation. I begin by establishing a better understanding of why these two processes are innately error-prone, and introduce a simple set of powerful yet usable mechanisms that can help reduce an initial portion of these errors. I then show that by coupling these mechanisms with interactive visualizations, users are able to both identify remaining errors and leverage those errors to construct more accurate queries and transformations. The direct contributions of this dissertation are (1) graphic-based query capabilities over points, intervals, and absences, (2) an integer programming strategy for processing temporal queries, (3) a Find & Replace system for transforming event sequences, and (4) eight case studies that demonstrate the utility and validity of these approaches. However, this work is designed more broadly to open new avenues of research in how visualization and visual analytics tools can be leveraged for tasks beyond data exploration.
    URI
    http://hdl.handle.net/1903/15305
    Collections
    • Computer Science Theses and Dissertations
    • UMD Theses and Dissertations

    DRUM is brought to you by the University of Maryland Libraries
    University of Maryland, College Park, MD 20742-7011 (301)314-1328.
    Please send us your comments.
    Web Accessibility