UMD Theses and Dissertations

Permanent URI for this collection: http://hdl.handle.net/1903/3

New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4-month delay in the appearance of a given thesis/dissertation in DRUM.

More information is available at Theses and Dissertations at University of Maryland Libraries.

Search Results

Now showing 1 - 5 of 5
  • Situated Analytics for Data Scientists
    (2022) Batch, Andrea; Elmqvist, Niklas E; Library & Information Services; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Much of Mark Weiser's vision of "ubiquitous computing" has come to fruition: We live in a world of interfaces that connect us with systems, devices, and people wherever we are. However, those of us in jobs that involve analyzing data and developing software find ourselves tied to environments that limit when and where we may conduct our work; it is ungainly and awkward to pull out a laptop during a stroll through a park, for example, and difficult to write a program on one's phone. In this dissertation, I discuss the current state of data visualization in data science and analysis workflows, the emerging domains of immersive and situated analytics, and how immersive and situated implementations and visualization techniques can be used to support data science. I will then describe the results of several years of my own empirical work with data scientists and other analytical professionals, particularly (though not exclusively) those employed with the U.S. Department of Commerce. These results, as they relate to visualization and visual analytics design based on user task performance, observations by the researcher and participants, and evaluation of observational data collected during user sessions, represent the first thread of research I will discuss in this dissertation. I will demonstrate how they might act as the guiding basis for my implementation of immersive and situated analytics systems and techniques. As a data scientist and economist myself, I am naturally inclined to want to use high-frequency observational data to the end of realizing a research goal; indeed, a large part of my research contributions, and a second "thread" of research to be presented in this dissertation, has been around interpreting user behavior using real-time data collected during user sessions. I argue that the relationship between immersive analytics and data science can and should be reciprocal: While immersive implementations can support data science work, methods borrowed from data science are particularly well-suited for supporting the evaluation of the embodied interactions common in immersive and situated environments. I make this argument based on both the ease and the importance of collecting spatial data from the sensors that immersive systems require in order to function, both of which I have experienced during the course of my own empirical work with data scientists. As part of this thread of research, this dissertation will introduce a framework for interpreting user session data that I evaluate with user experience researchers working in the tech industry. Finally, this dissertation will present a synthesis of these two threads of research. I combine the design guidelines I derive from my empirical work with machine learning and signal processing techniques to interpret user behavior in real time in Wizualization, a mid-air gesture- and speech-based augmented reality visual analytics system.
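    The abstract above mentions applying signal processing and machine learning to the spatial data that immersive systems already collect. The sketch below is a hedged illustration only, not Wizualization's actual pipeline: it smooths a stream of 3D hand positions with an exponential moving average and segments candidate gesture strokes with a simple velocity threshold. The function name, parameters, and threshold values are assumptions made for this example.

```python
import numpy as np

def segment_gestures(positions, timestamps, alpha=0.3, velocity_threshold=0.5):
    """Split a stream of 3D hand positions into candidate gesture strokes.

    positions: (N, 3) array of hand positions in meters (e.g., from an AR headset).
    timestamps: (N,) array of sample times in seconds.
    alpha: exponential-moving-average smoothing factor (illustrative value).
    velocity_threshold: speed (m/s) above which the hand counts as "in motion".
    """
    positions = np.asarray(positions, dtype=float)
    timestamps = np.asarray(timestamps, dtype=float)

    # Smooth the raw sensor stream to suppress tracking jitter.
    smoothed = np.empty_like(positions)
    smoothed[0] = positions[0]
    for i in range(1, len(positions)):
        smoothed[i] = alpha * positions[i] + (1 - alpha) * smoothed[i - 1]

    # Estimate instantaneous speed between consecutive samples.
    dt = np.diff(timestamps)
    speed = np.linalg.norm(np.diff(smoothed, axis=0), axis=1) / np.maximum(dt, 1e-6)

    # A "stroke" is a maximal run of samples whose speed exceeds the threshold.
    moving = speed > velocity_threshold
    strokes, start = [], None
    for i, is_moving in enumerate(moving):
        if is_moving and start is None:
            start = i
        elif not is_moving and start is not None:
            strokes.append((start, i))
            start = None
    if start is not None:
        strokes.append((start, len(moving)))
    return strokes

# Example: a synthetic 2-second stream sampled at 60 Hz.
t = np.linspace(0.0, 2.0, 120)
path = np.column_stack([np.sin(3 * t), np.zeros_like(t), 0.2 * t])
print(segment_gestures(path, t))
```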
  • Towards Fast and Efficient Representation Learning
    (2018) Li, Hao; Samet, Hanan; Goldstein, Thomas; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The success of deep learning and convolutional neural networks in many fields is accompanied by a significant increase in computation cost. With increasing model complexity and the pervasive usage of deep neural networks, there is a surge of interest in fast and efficient model training and inference on both cloud and embedded devices. Meanwhile, understanding the reasons for trainability and generalization is fundamental for further development. This dissertation explores approaches for fast and efficient representation learning with a better understanding of trainability and generalization. In particular, we ask the following questions and provide our solutions: 1) How can we reduce the computation cost for fast inference? 2) How can we train low-precision models on resource-constrained devices? 3) What does the loss surface look like for neural nets, and how does it affect generalization? To reduce the computation cost for fast inference, we propose to prune filters from CNNs that are identified as having a small effect on the prediction accuracy. By removing filters with small norms together with their connected feature maps, the computation cost can be reduced accordingly without requiring special software or hardware. We show that a simple filter pruning approach can reduce the inference cost while regaining close to the original accuracy by retraining the networks. To further reduce the inference cost, quantizing model parameters with low-precision representations has shown significant speedups, especially for edge devices that have limited computing resources, memory capacity, and power. To enable on-device learning on low-power systems, removing the dependency on a full-precision model during training is the key challenge. We study various quantized training methods with the goal of understanding the differences in behavior and the reasons for success or failure. We address the issue of why algorithms that maintain floating-point representations work so well, while fully quantized training methods stall before training is complete. We show that training algorithms that exploit high-precision representations have an important greedy search phase that purely quantized training methods lack, which explains the difficulty of training using low-precision arithmetic. Finally, we explore the structure of neural loss functions, and the effect of loss landscapes on generalization, using a range of visualization methods. We introduce a simple filter normalization method that helps us visualize loss function curvature and make meaningful side-by-side comparisons between loss functions. The sharpness of minimizers correlates well with generalization error when this visualization is used. Then, using a variety of visualizations, we explore how training hyper-parameters affect the shape of minimizers, and how network architecture affects the loss landscape.
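    The norm-based filter pruning described in the abstract can be sketched in a few lines. The PyTorch snippet below is a minimal illustration rather than the dissertation's implementation: it ranks the filters of one convolutional layer by their L1 norms and rebuilds the layer with only the largest-norm filters. The function name and the 30% pruning ratio are assumptions, and pruning the matching input channels of the downstream layer is omitted for brevity.

```python
import torch
import torch.nn as nn

def prune_smallest_filters(conv: nn.Conv2d, prune_ratio: float = 0.3) -> nn.Conv2d:
    """Return a new Conv2d layer with the lowest-norm output filters removed."""
    weights = conv.weight.data                   # shape: (out_channels, in_channels, kH, kW)
    l1_norms = weights.abs().sum(dim=(1, 2, 3))  # one L1 norm per output filter
    n_keep = max(1, int(weights.size(0) * (1 - prune_ratio)))
    keep = torch.argsort(l1_norms, descending=True)[:n_keep]

    # Rebuild a smaller layer and copy over the surviving filters.
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = weights[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned

# Example: prune 30% of the filters in a single convolutional layer.
layer = nn.Conv2d(64, 128, kernel_size=3, padding=1)
smaller = prune_smallest_filters(layer, prune_ratio=0.3)
print(layer.weight.shape, "->", smaller.weight.shape)
```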
  • Integrating Statistics and Visualization to Improve Exploratory Social Network Analysis
    (2008-08-21) Perer, Adam; Shneiderman, Ben; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Social network analysis is emerging as a key technique for understanding social, cultural, and economic phenomena. However, social network analysis is inherently complex, since analysts must understand every individual's attributes as well as the relationships between individuals. There are many statistical algorithms that reveal nodes occupying key social positions and forming cohesive social groups. However, it is difficult to find outliers and patterns in strictly quantitative output. In these situations, information visualizations can enable users to make sense of their data, but typical network visualizations are often hard to interpret because of overlapping nodes and tangled edges. My first contribution improves the process of exploratory social network analysis. I have designed and implemented a novel social network analysis tool, SocialAction (http://www.cs.umd.edu/hcil/socialaction), that integrates both statistics and visualizations to enable users to quickly derive the benefits of both. Statistics are used to detect important individuals, relationships, and clusters. Instead of a tabular display of numbers, the results are integrated with a network visualization in which users can easily and dynamically filter nodes and edges. The visualizations simplify the statistical results, facilitating sensemaking and discovery of features such as distributions, patterns, trends, gaps, and outliers. The statistics simplify the comprehension of a sometimes chaotic visualization, allowing users to focus on statistically significant nodes and edges. SocialAction was also designed to help analysts explore non-social networks, such as citation, communication, financial, and biological networks. My second contribution extends lessons learned from SocialAction and provides design guidelines for interactive techniques to improve exploratory data analysis. A taxonomy of seven interactive techniques is augmented with computed attributes from statistics and data mining to improve information visualization exploration. Furthermore, systematic yet flexible design goals are provided to help guide domain experts through complex analysis over days, weeks, and months. My third contribution demonstrates the effectiveness of long-term case studies with domain experts to measure the creative activities of information visualization users. Evaluating information visualization tools is problematic because controlled studies may not effectively represent the workflow of analysts. Discoveries occur over weeks and months, and exploratory tasks may be poorly defined. To capture authentic insights, I designed an evaluation methodology that used structured and replicated long-term case studies. The methodology was applied in case studies with domain experts and demonstrated the effectiveness of integrating statistics and visualization.
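    SocialAction is a standalone tool, but the statistics-then-filter loop described above can be illustrated with a small sketch. The Python example below is an assumption for illustration, not SocialAction's code: it ranks nodes by betweenness centrality using networkx and keeps only the top-ranked nodes and the edges among them for display.

```python
import networkx as nx

def filter_by_betweenness(graph: nx.Graph, top_k: int = 10) -> nx.Graph:
    """Keep the top_k nodes by betweenness centrality, plus the edges among them.

    Mirrors the statistics-then-visualize loop: rank nodes with a network
    statistic, then reduce the view to the statistically interesting part.
    """
    centrality = nx.betweenness_centrality(graph)
    ranked = sorted(centrality, key=centrality.get, reverse=True)
    return graph.subgraph(ranked[:top_k]).copy()

# Example: rank and filter a small random network.
g = nx.erdos_renyi_graph(n=100, p=0.05, seed=42)
key_players = filter_by_betweenness(g, top_k=10)
print(sorted(key_players.nodes()))
```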
  • Construction of Test Facility to Measure and Visualize Refrigerant Maldistribution in Multiport Evaporator Headers
    (2005-07-18) Linde, John Eric; Radermacher, Reinhard; Mechanical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    In a refrigeration cycle, condensed liquid refrigerant is expanded to a two-phase fluid entering the evaporator. In many applications, the evaporator paths are divided into a number of parallel sections to keep the pressure drop across the evaporator within a reasonable range and to maximize overall heat exchanger performance. Since the refrigerant entering the evaporator is two-phase and its quality changes depending upon the operating conditions, achieving proper refrigerant distribution to the individual sections is not an easy task. Nonuniform distribution, or maldistribution, causes dryout in sections with lower mass flow by superheating the refrigerant gas. This can result in a nonuniform heat exchanger surface temperature distribution. Single-phase heat transfer coefficients (HTCs) are much lower than two-phase HTCs. When dryout occurs, both refrigerant-side and air-side HTCs are lower than those of wet surfaces. In addition to this, the temperature difference between the air and
  • Temporal Treemaps for Visualizing Time Series Data
    (2004-05-12) Chintalapani, Gouthami; Shneiderman, Ben; Plaisant, Catherine; Systems Engineering
    A treemap is an interactive graphical technique for visualizing large hierarchical information spaces using nested rectangles in a space-filling manner. The size and color of the rectangles show data attributes and enable users to spot trends, patterns, or exceptions. Current implementations of treemaps help explore time-invariant data. However, many real-world applications require monitoring hierarchical, time-variant data. This thesis extends treemaps to interactively explore time series data by mapping temporal changes to the color attribute of treemaps. Specific contributions of this thesis include: (1) temporal treemaps for exploring time series data through visualizing absolute or relative changes, animating them over time, filtering data items, and discovering trends using time series graphs; (2) the design and implementation of extensible software modules based on systems engineering methodologies and an object-oriented approach; and (3) validation through five case studies (health statistics, web logs, production data, birth statistics, and help-desk tickets), with future improvements identified from user feedback.
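    The core idea of mapping temporal change to the color attribute can be illustrated with a short sketch. The Python function below is an assumed example, not the thesis software: it converts the relative change between two time steps into a diverging color (gray for no change, toward green for growth, toward red for decline), clamped so that a single extreme value does not dominate the scale.

```python
def relative_change_color(previous: float, current: float, max_change: float = 0.5):
    """Map the relative change between two time steps to an RGB color tuple."""
    if previous == 0:
        change = 0.0
    else:
        change = (current - previous) / abs(previous)

    # Clamp to +/- max_change so one outlier does not wash out the color scale.
    change = max(-max_change, min(max_change, change))
    t = change / max_change            # normalized to [-1, 1]

    if t >= 0:                         # gray -> green for growth
        return (0.7 * (1 - t), 0.7 + 0.3 * t, 0.7 * (1 - t))
    else:                              # gray -> red for decline
        return (0.7 + 0.3 * (-t), 0.7 * (1 + t), 0.7 * (1 + t))

# Example: a node whose value grew 20% between two snapshots.
print(relative_change_color(previous=100.0, current=120.0))
```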