Authors: David Gotz, Jonathan Zhang, Wenyuan Wang, Joshua Shrestha, David Borland
Abstract: Temporal event data are collected across a broad range of domains, and a variety of visual analytics techniques have been developed to empower analysts working with this form of data. These techniques generally display aggregate statistics computed over sets of event sequences that share common patterns. Such techniques are often hindered, however, by the high dimensionality of many real-world event sequence datasets because the large number of distinct event types within such data prevents effective aggregation. A common coping strategy for this challenge is to group event types together as a pre-process, prior to visualization, so that each group can be represented within an analysis as a single event type. This approach can be highly effective because it directly reduces the dimensionality of the event dataset. However, computing these event groupings as a pre-process also places significant constraints on the analysis. This paper presents a new visual analytics approach for dynamic hierarchical dimension aggregation. The approach leverages a predefined hierarchy of dimensions to computationally quantify the informativeness of alternative levels of grouping within the hierarchy at runtime. This information is then interactively visualized, enabling users to dynamically explore the hierarchy to select the most appropriate level of grouping to use at any individual step within an analysis. Key contributions include an efficient and tunable algorithm for interactively determining the most informative set of event groupings for a specific analysis context from within a large-scale hierarchy of event types, and a scented scatter-plus-focus visualization design with an optimization-based layout algorithm that supports interactive hierarchical exploration of alternative event type groupings.
While these contributions are generalizable to other types of problems, we apply them to high-dimensional event sequence analysis using large-scale event type hierarchies from the medical domain. We describe their use within a medical cohort analysis tool called Cadence, demonstrate an example in which the proposed technique supports better views of event sequence data, and report findings from domain expert interviews.