Temporal Context Modeling for Text Streams

Thumbnail Image


Rao_umd_0117E_19268.pdf (5.53 MB)
No. of downloads: 3714

Publication or External Link





There is increasing recognition that time plays an essential role in many information seeking tasks. This dissertation explores temporal models on evolving streams of text and the role that such models play in improving information access. I consider two cases: a stream of social media posts by many users for tweet search and a stream of queries by an individual user for voice search. My work explores the relationship between temporal models and context models: for tweet search, the evolution of an event serves as the context of clustering relevant tweets; for voice search, the user's history of queries provides the context for helping understand her true information need.

First, I tackle the tweet search problem by modeling the temporal contexts of the underlying collection. The intuition is that an information need in Twitter usually correlates with a breaking news event, thus tweets posted during that event are more likely to be relevant. I explore techniques to model two different types of temporal signals: pseudo trend and query trend. The pseudo trend is estimated through the distribution of timestamps from an initial list of retrieved documents given a query, which I model through continuous hidden Markov approach as well as neural network-based methods for relevance ranking and sequence modeling. As an alternative, the query trend, is directly estimated from the temporal statistics of query terms, obviating the need for an initial retrieval. I propose two different approaches to exploit query trends: a linear feature-based ranking model and a regression-based model that recover the distribution of relevant documents directly from query trends. Extensive experiments on standard Twitter collections demonstrate the superior effectivenesses of my proposed techniques.

Second, I introduce the novel problem of voice search on an entertainment platform, where users interact with a voice-enabled remote controller through voice requests to search for TV programs. Such queries range from specific program navigation (i.e., watch a movie) to requests with vague intents and even queries that have nothing to do with watching TV. I present successively richer neural network architectures to tackle this challenge based on two key insights: The first is that session context can be exploited to disambiguate queries and recover from ASR errors, which I operationalize with hierarchical recurrent neural networks. The second insight is that query understanding requires evidence integration across multiple related tasks, which I identify as program prediction, intent classification, and query tagging. I present a novel multi-task neural architecture that jointly learns to accomplish all three tasks. The first model, already deployed in production, serves millions of queries daily with an improved customer experience. The multi-task learning model is evaluated on carefully-controlled laboratory experiments, which demonstrates further gains in effectiveness and increased system capabilities. This work now serves as the core technology in Comcast Xfinity X1 entertainment platform, which won an Emmy award in 2017 for the technical contribution in advancing television technologies.

This dissertation presents families of techniques for modeling temporal information as contexts to assist applications with streaming inputs, such as tweet search and voice search. My models not only establish the state-of-the-art effectivenesses on many related tasks, but also reveal insights of how various temporal patterns could impact real information-seeking processes.