Query Planning for Range Queries with User-defined Aggregation on
Multi-dimensional Scientific Datasets
Date
1999-02-23
Authors
Chang, Chialin
Kurc, Tahsin
Sussman, Alan
Saltz, Joel
Abstract
Applications that make use of very large scientific datasets have
become an increasingly important subset of scientific applications. In
these applications, the datasets are often multi-dimensional, i.e.,
data items are associated with points in a multi-dimensional attribute
space. The processing is usually highly stylized, with the basic
processing steps consisting of (1) retrieval of a subset of all
available data in the input dataset via a range query, (2) projection
of each input data item to one or more output data items, and (3) some
form of aggregation of all the input data items that project to
each output data item. We have developed an infrastructure, called the
Active Data Repository (ADR), that integrates storage, retrieval and
processing of multi-dimensional datasets on shared-nothing
architectures. In this paper we address query planning and execution
strategies for range queries with user-defined processing. We evaluate
three potential query planning strategies within the ADR framework
under several application scenarios, and present experimental results
on the performance of the strategies on a multiprocessor IBM SP2.
(Also cross-referenced as UMIACS-TR-99-15)
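
The three processing steps named in the abstract (range-query retrieval, projection, aggregation) can be illustrated with a minimal single-process sketch. This is not ADR's interface: the function `run_range_query`, its parameters, and the sample data are hypothetical names chosen for illustration, and the sketch ignores the shared-nothing distribution and the query planning strategies that the report evaluates.

```python
from collections import defaultdict


def run_range_query(items, lo, hi, project, aggregate, init):
    """Hypothetical sketch of the retrieve / project / aggregate loop.

    items     -- iterable of (point, value); point is a tuple of coordinates
    lo, hi    -- inclusive lower and upper bounds of the range query, per dimension
    project   -- user-defined: maps an input point to one or more output keys
    aggregate -- user-defined: combines an accumulator with an input value
    init      -- zero-argument callable returning the initial accumulator
    """
    output = defaultdict(init)
    for point, value in items:
        # (1) Retrieval: keep only items whose point lies inside the query box.
        if all(l <= c <= h for l, c, h in zip(lo, point, hi)):
            # (2) Projection: an input item may map to several output items.
            for key in project(point):
                # (3) Aggregation: fold the value into that output item.
                output[key] = aggregate(output[key], value)
    return dict(output)


# Assumed example: 2-D input points projected onto a grid that is twice as
# coarse, with max as the user-defined aggregation.
items = [((0, 0), 3.0), ((1, 1), 5.0), ((2, 3), 1.0), ((5, 5), 9.0)]
result = run_range_query(
    items,
    lo=(0, 0),
    hi=(3, 3),
    project=lambda p: [(p[0] // 2, p[1] // 2)],
    aggregate=max,
    init=lambda: float("-inf"),
)
print(result)
```

Here the item at (5, 5) falls outside the query range and is never projected, while the items at (0, 0) and (1, 1) both project to output cell (0, 0) and are combined by the aggregation function.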