Browsing by Author "Chawathe, Sudarshan S."

Context-Sensitive Search and Exploration of XML Text
(2001-05-10) Baby, Thomas; Chawathe, Sudarshan S.
XML permits documents with arbitrary nested context (tag structure). We investigate how this context may be used to aid the task of searching and exploring XML text. We describe the design and implementation of the Cextor system, which includes a context-sensitive text-search engine and a novel technique for organizing and exploring very large search results based on context. A distinguishing feature of this technique is that it does not assume search results are of modest size. Rather, it is designed to cope with search results that are potentially the size of the database. We present the results of an experimental evaluation of Cextor on derived data from the Web. (Cross-referenced as UMIACS-TR-2001-12)

Cooperative Data Dissemination in a Serverless Environment
(2004-02-25) Anand, Abheek; Chawathe, Sudarshan S.
We describe the design and implementation of CoDD, a system for cooperative data dissemination in a serverless environment. CoDD allows a set of loosely-coupled, distributed peer nodes to receive subsets of a data stream, by describing interests using subscription queries. CoDD maintains statistical information on the characteristics of data and queries, and uses it to organize nodes in a hierarchical, data-aware topology. Data is disseminated using overlays which are created to try to minimize the excess data flowing through the system, while maintaining low latency and satisfying fanout constraints. CoDD is designed to allow nodes to determine individual degrees of contribution to the system, and to adapt gracefully to temporal changes in the data distribution using an adaptive reorganization component. We present the results of our experimental evaluation of CoDD.
(UMIACS-TR-2004-07)

Privacy-Preserving Inter-Database Operations
(2004-03-25) Liang, Gang; Chawathe, Sudarshan S.
We present protocols for distributed computation of relational intersections and equi-joins such that each site gains no information about the tuples at the other site that do not intersect or join with its own tuples. Such protocols form the building blocks of distributed information systems that manage sensitive information, such as patient records and financial transactions, that must be shared in only a limited manner. We discuss applications of our protocols, outlining the ramifications of assumptions such as semi-honesty. In addition to improving on the efficiency of earlier protocols, our protocols are asymmetric, making them especially applicable to applications in which a low-powered client interacts with a server in a privacy-preserving manner. We present a brief experimental study of our protocols. (UMIACS-TR-2004-09)

Skipping Streams with XHints
(2004-03-25) Gupta, Akhil; Chawathe, Sudarshan S.
When streaming semi-structured data is processed by a well-designed query processor, parsing constitutes a significant portion of the running time. Further improvements in performance therefore require some method to overcome the high cost of parsing. We have designed a general-purpose mechanism by which a producer of streaming data may augment the data stream with hints that permit a downstream processor to skip parsing parts of the stream. Inserting such hints requires additional processing by the producer of data; however, the resulting stream is more valuable to consumers (since they have to perform less processing), making such processing worthwhile. We present a set of hint schemes and describe how they are used by query engines. We demonstrate the benefits of our approach using an experimental study based on a hints-aware XPath query engine.
Our results show that XHints can improve the performance of XPath query engines by as much as 100%. (UMIACS-TR-2004-11)

XSQ: A Streaming XPath Engine
(2003-08-01) Peng, Feng; Chawathe, Sudarshan S.
We have implemented and released the XSQ system for evaluating XPath queries on streaming XML data. XSQ supports XPath features such as multiple predicates, closures, and aggregation, which pose interesting challenges for streaming evaluation. Our implementation is based on using a hierarchical arrangement of pushdown transducers augmented with buffers. A notable feature of XSQ is that it buffers data for only as long as it must be buffered by any streaming XPath query engine. We present a detailed experimental study that characterizes the performance of XSQ and related systems, and illustrates the performance implications of XPath features such as closures. (UMIACS-TR-2003-62)

XSQ: Streaming XPath Queries
(2002-09-18) Peng, Feng; Chawathe, Sudarshan S.
We describe the design and implementation of XSQ, a system for evaluating XPath 1.0 queries on streaming XML data. Each XML element in the input data is presented to the system only once in a serial order determined by the data source. It is not possible to seek forward or backward in the data stream, and data cannot be recalled unless explicitly buffered by the system. Processing XPath queries correctly and efficiently in this environment is a challenging task and, to the best of our knowledge, XSQ is the first system that efficiently implements XPath queries with features such as closures and multiple predicates. XSQ is efficient in both time and space. Stream query processing typically adds only 25% to the time required for parsing the stream (and discarding results). XSQ's space usage is optimal in the sense that it buffers only data that must be buffered by all streaming query processors.
We describe the formal framework of hierarchical pushdown transducers that forms the basis of the XSQ system and highlight experimental results on real and synthetic data. (Also UMIACS-TR-2002-81)