XSQ: Streaming XPath Queries
Date
2002-09-18
Authors
Peng, Feng
Chawathe, Sudarshan S.
Abstract
We describe the design and implementation of XSQ, a system for
evaluating XPath 1.0 queries on streaming XML data. Each XML element in
the input data is presented to the system only once in a serial order
determined by the data source. It is not possible to seek forward or
backward in the data stream, and data cannot be recalled unless explicitly
buffered by the system. Processing XPath queries correctly and
efficiently in this environment is a challenging task and, to the best of
our knowledge, XSQ is the first system that efficiently implements XPath
queries with features such as closures and multiple predicates. XSQ is
efficient in both time and space. Stream query processing typically adds
only 25% to the time required for parsing the stream (and discarding
results). XSQ's space usage is optimal in the sense that it buffers only
data that must be buffered by all streaming query processors. We describe
the formal framework of hierarchical pushdown transducers that forms the
basis of the XSQ system and highlight experimental results on real and
synthetic data.
(Also UMIACS-TR-2002-81)
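
The buffering constraint described in the abstract can be illustrated with a small sketch. This is not XSQ's hierarchical pushdown transducer; it is a hand-written single-pass handler for one fixed, hypothetical query, /root/pub[year>2000]/book/name/text(), built on Python's standard xml.sax module. The class name BufferingQuery, the function evaluate, and the element names are all assumptions made for illustration. The point it shows is that a name that arrives before its pub's year cannot be output or discarded yet, so it must be buffered until the predicate is decided.

```python
import xml.sax


class BufferingQuery(xml.sax.ContentHandler):
    """One-pass evaluation of the hypothetical query
    /root/pub[year>2000]/book/name/text()."""

    def __init__(self):
        super().__init__()
        self.path = []          # names of the currently open elements
        self.text = []          # character data seen since the last tag
        self.pending = []       # names buffered while the predicate is undecided
        self.satisfied = False  # has the current pub had a year > 2000?
        self.results = []       # query output, in document order

    def startElement(self, name, attrs):
        self.path.append(name)
        self.text = []
        if self.path == ["root", "pub"]:
            # A new pub starts with an empty buffer and an undecided predicate.
            self.pending, self.satisfied = [], False

    def characters(self, content):
        self.text.append(content)

    def endElement(self, name):
        value = "".join(self.text).strip()
        if self.path == ["root", "pub", "year"]:
            if not self.satisfied and value.isdigit() and int(value) > 2000:
                # Predicate is now known to hold: flush everything buffered.
                self.satisfied = True
                self.results.extend(self.pending)
                self.pending = []
        elif self.path == ["root", "pub", "book", "name"]:
            # Emit directly if the predicate already holds, otherwise buffer.
            (self.results if self.satisfied else self.pending).append(value)
        elif self.path == ["root", "pub"]:
            # The pub closed without the predicate holding: drop the buffer.
            self.pending = []
        self.path.pop()
        self.text = []


def evaluate(xml_text):
    """Run the fixed query over xml_text and return the matching names."""
    handler = BufferingQuery()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.results
```

In this sketch, a name is held only while the outcome of the year predicate for its enclosing pub is still unknown, and the buffer is released as soon as a qualifying year is seen or the pub closes. That mirrors, in a very reduced form, the idea of buffering only what cannot yet be decided.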