XSQ: Streaming XPath Queries
Abstract
We describe the design and implementation of XSQ, a system for evaluating XPath 1.0 queries on streaming XML data. Each XML element in the input data is presented to the system only once, in a serial order determined by the data source. It is not possible to seek forward or backward in the data stream, and data cannot be recalled unless it is explicitly buffered by the system. Processing XPath queries correctly and efficiently in this environment is a challenging task and, to the best of our knowledge, XSQ is the first system that efficiently implements XPath queries with features such as closures and multiple predicates. XSQ is efficient in both time and space. Stream query processing typically adds only 25% to the time required for parsing the stream (and discarding results). XSQ's space usage is optimal in the sense that it buffers only data that must be buffered by all streaming query processors. We describe the formal framework of hierarchical pushdown transducers that forms the basis of the XSQ system, and we highlight experimental results on real and synthetic data. (Also UMIACS-TR-2002-81)
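To illustrate why buffering is unavoidable in this setting, the following is a minimal sketch that evaluates one hand-coded query of the form `/pub/book[year]/title/text()` in a single pass. It is not the XSQ implementation and does not use hierarchical pushdown transducers; the query, the class name `BufferedPathHandler`, and the sample document are assumptions made for illustration. A `title` that arrives before any `year` child cannot be emitted or discarded yet, so it is held until the predicate is decided.

```python
import xml.sax


class BufferedPathHandler(xml.sax.ContentHandler):
    """One-pass evaluation of parent[predicate_child]/target_child/text().

    Hypothetical sketch: handles only this single query shape.
    """

    def __init__(self, parent_path, predicate_child, target_child):
        super().__init__()
        self.parent_path = parent_path          # e.g. ["pub", "book"]
        self.predicate_child = predicate_child  # e.g. "year"
        self.target_child = target_child        # e.g. "title"
        self.stack = []            # names of the currently open elements
        self.buffer = []           # candidates whose predicate is undecided
        self.predicate_ok = False  # has the current parent satisfied it?
        self.text = None           # text pieces of the open target, if any
        self.results = []          # collected here; a real stream would emit

    def startElement(self, name, attrs):
        self.stack.append(name)
        if self.stack == self.parent_path:
            # A new parent element starts: nothing is known about it yet.
            self.buffer, self.predicate_ok = [], False
        elif self.stack[:-1] == self.parent_path:
            if name == self.predicate_child:
                # Predicate is now known to hold: flush buffered candidates.
                self.predicate_ok = True
                self.results.extend(self.buffer)
                self.buffer = []
            elif name == self.target_child:
                self.text = []

    def characters(self, content):
        if self.text is not None:
            self.text.append(content)

    def endElement(self, name):
        closing_target = (
            self.text is not None
            and self.stack[:-1] == self.parent_path
            and name == self.target_child
        )
        if closing_target:
            value = "".join(self.text)
            self.text = None
            if self.predicate_ok:
                self.results.append(value)  # safe to output immediately
            else:
                self.buffer.append(value)   # must wait for a later sibling
        elif self.stack == self.parent_path:
            # Parent closed without satisfying the predicate: discard.
            self.buffer = []
        self.stack.pop()


def run_query(document: bytes):
    handler = BufferedPathHandler(["pub", "book"], "year", "title")
    xml.sax.parseString(document, handler)
    return handler.results


sample = (
    b"<pub>"
    b"<book><title>A</title><year>2002</year><title>B</title></book>"
    b"<book><title>C</title></book>"
    b"</pub>"
)
print(run_query(sample))
```

In this sketch, `A` is buffered until `year` is seen, `B` is output as soon as it closes, and `C` is dropped when its `book` ends without a `year`. Only values whose membership in the result is still undecided are held in memory, which is the kind of buffering the abstract describes as unavoidable.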