EXPLORING DIFFERENCES IN MULTIVARIATE DATASETS USING HIERARCHIES: AN INTERACTIVE INFORMATION VISUALIZATION APPROACH
Guerra Gómez, John Alexis
Hierarchies are a useful way of representing data. The parent-child relationships they define facilitate the analysis of a dataset by breaking it down into its component parts. Representing data as hierarchies can also be used to track changes to a dataset over time or between versions. For example, analysts can use hierarchies to uncover changes in the US Federal Budget over the last twenty years by grouping accounts by Agencies and Bureaus. Similarly, a company manager can analyze changes in product sales due to the holiday season by breaking them down by markets and product categories. Exploring differences in such trees could help them understand changes in the data. However, comparing hierarchies is a difficult task, even when comparing two trees with a small number of nodes. To address this, information visualization techniques were used to support data comparison tasks using hierarchies. After evaluating my techniques with domain experts on real-world problems, I identified and addressed two main research topics.

This dissertation first tackled the problem of comparing two versions of a tree by accounting for two types of change at once, whereas most of the significant work on this topic has focused only on changes in node values or only on changes in topology. TreeVersity (http://hcil.cs.umd.edu/treeversity) is a comparison tool that allows users to explore changes between two versions of a tree by tracking node value differences and newly created or removed nodes. Domain experts using TreeVersity were excited to discover differences in the trees, but expressed a desire to explore the evolution of a dataset over time. To that end, they suggested applying TreeVersity's comparison capabilities to datasets that are not inherently hierarchical.

Following users' feedback, the problem of exploring changes over time in datasets that can be categorized as trees was addressed next.
TreeVersity2 (http://treeversity.cattlab.umd.edu) is a web-based data comparison tool that allows users to explore changes over time in a tree, and in datasets that are not inherently hierarchical by categorizing them by their attributes. TreeVersity2 also helps users navigate the sometimes large number of differences between versions of a tree using an interactive textual reporting tool.

My research has resulted in three main contributions. First, the introduction of the Bullet, a visualization glyph that represents four characteristics of change (as described in Section 1.2) in tree nodes, and the implementation of the Bullet in TreeVersity. Second, the creation of the StemView, a tree visualization technique that represents five characteristics of change in all the nodes of a tree (not just the leaves), and the implementation of the StemView in TreeVersity2. This second contribution also includes the development of the reporting tool, another feature of TreeVersity2, which helps users navigate outstanding changes in the tree with textual representations and coordinated interactions. Third, the development of 13 case studies with domain experts on real-world comparison problems. The case studies have validated the utility and flexibility of my approaches. Finally, my research opens possibilities for future research on comparing hierarchical structures.
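
To make the comparison idea concrete, the following is a minimal Python sketch of the general technique described above: flat records are categorized by their attributes into a tree, and two versions of that tree are compared by node value differences and by created or removed nodes. It is not the TreeVersity or TreeVersity2 implementation. The function names (`build_tree`, `compare_trees`), the record layout, and the budget figures are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_tree(rows, attributes, value_key):
    """Aggregate flat rows into {path: total}, one entry per tree node.

    A path is a tuple of attribute values; the empty tuple is the root.
    """
    totals = defaultdict(float)
    for row in rows:
        path = ()
        totals[path] += row[value_key]          # root accumulates everything
        for attr in attributes:
            path = path + (row[attr],)          # descend one level
            totals[path] += row[value_key]
    return dict(totals)


def compare_trees(old, new):
    """Label every node as created, removed, or kept, with its value change."""
    diff = {}
    for path in old.keys() | new.keys():
        if path not in old:
            diff[path] = ("created", new[path])
        elif path not in new:
            diff[path] = ("removed", -old[path])
        else:
            diff[path] = ("kept", new[path] - old[path])
    return diff


# Hypothetical data: two versions of a tiny budget, grouped by agency and bureau.
version_1 = [
    {"agency": "A", "bureau": "X", "amount": 100.0},
    {"agency": "A", "bureau": "Y", "amount": 50.0},
]
version_2 = [
    {"agency": "A", "bureau": "X", "amount": 120.0},
    {"agency": "B", "bureau": "Z", "amount": 30.0},
]

levels = ["agency", "bureau"]
diff = compare_trees(
    build_tree(version_1, levels, "amount"),
    build_tree(version_2, levels, "amount"),
)

for path, (status, change) in sorted(diff.items()):
    print("/".join(path) or "(root)", status, change)
```

Because values are aggregated at every level, the sketch reports changes for inner nodes as well as leaves. Here the root total is unchanged even though one bureau was removed and one agency was created, which is the kind of difference that a comparison of totals alone would hide.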