An Effective Approach to Temporally Anchored Information Retrieval

dc.contributor.author: Wei, Zheng
dc.contributor.author: JaJa, Joseph
dc.date.accessioned: 2012-08-18T03:23:54Z
dc.date.available: 2012-08-18T03:23:54Z
dc.date.issued: 2012-08-17
dc.description.abstract: In this paper we consider the information retrieval problem over a collection of time-evolving documents, where a search is specified by a query text together with a temporal specification. A solution to this problem is critical for a number of emerging large-scale applications involving archived collections of web content, social network interactions, blog traffic, and information feeds. Given a collection of time-evolving documents, we develop an effective strategy for creating inverted files and indexing structures such that a temporally anchored query can be processed quickly using strategies similar to those used in the non-temporal case. The inverted files generated have exactly the same structure as those generated for the classical (non-temporal) case, and the size of the additional indexing structures is shown to be small. Well-known existing algorithms for constructing inverted files or for computing relevance can be extended to handle the temporal case. Moreover, we present high-throughput, scalable parallel algorithms that build the inverted files and the additional indexing structures on multicore processors and on clusters of multicore processors. We illustrate the effectiveness of our approach through experimental tests on a number of web archives, and we compare both the space used by the indexing structures and postings lists and the search time of our approach against the traditional approach that ignores temporal information.
dc.identifier.uri: http://hdl.handle.net/1903/12879
dc.language.iso: en_US
dc.relation.ispartofseries: UMIACS;UMIACS-TR-2012-10
dc.title: An Effective Approach to Temporally Anchored Information Retrieval
dc.type: Technical Report
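
The abstract states the problem only at a high level: answer a query made of text plus a point in time, using classical inverted files and a small additional index. The following is a minimal sketch of that problem setting, not the algorithm from the report. It assumes each document version carries a half-open validity interval [start, end), keeps an ordinary term-to-postings map, and stores the intervals in a separate side table. The class name `TemporalIndex`, its methods, the integer timestamps, and the sample documents are all hypothetical.

```python
from collections import defaultdict


class TemporalIndex:
    """Toy index: classical postings plus a side table of validity intervals.

    Illustrative assumption only; this is not the structure from the report.
    """

    def __init__(self):
        # term -> list of version ids, in insertion (ascending) order
        self.postings = defaultdict(list)
        # version id -> (doc_id, start, end), valid on [start, end)
        self.versions = []

    def add_version(self, doc_id, start, end, text):
        """Index one version of a document and return its version id."""
        vid = len(self.versions)
        self.versions.append((doc_id, start, end))
        for term in set(text.lower().split()):
            self.postings[term].append(vid)
        return vid

    def search(self, query, t):
        """Return ids of documents whose version valid at time t has all terms."""
        terms = query.lower().split()
        if not terms:
            return []
        # Conjunctive query: intersect the postings of every term.
        candidates = set.intersection(
            *(set(self.postings.get(term, ())) for term in terms)
        )
        hits = []
        for vid in sorted(candidates):
            doc_id, start, end = self.versions[vid]
            if start <= t < end:  # temporal anchor filter
                hits.append(doc_id)
        return hits


# Hypothetical sample data: two versions of one page and one version of another.
index = TemporalIndex()
index.add_version("page-a", 2009, 2011, "web archive search")
index.add_version("page-a", 2011, 2013, "temporal web archive search")
index.add_version("page-b", 2010, 2013, "temporal index")

hits_2010 = index.search("temporal", 2010)
hits_2012 = index.search("temporal", 2012)
```

In this sketch the postings map is identical to a non-temporal inverted file, and only the side table knows about time. Filtering candidates after the intersection is the simplest possible choice and makes no claim about the search cost or space bounds reported by the authors.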

Files

Original bundle
Name: CS-TR-5012.pdf
Size: 1.28 MB
Format: Adobe Portable Document Format