Querying Very Large Multi-dimensional Datasets in ADR - Extended Abstract

Date
1999-05-26
Author
Kurc, Tahsin
Chang, Chialin
Ferreira, Renato
Sussman, Alan
Saltz, Joel
Abstract
This paper addresses optimizing the execution of range queries into
multi-dimensional datasets on distributed memory parallel machines within
the Active Data Repository framework. ADR is an infrastructure that
integrates storage, retrieval and processing of large multi-dimensional
datasets on distributed memory parallel architectures with multiple disks
attached to each node. We describe three potential strategies for
efficient execution of such queries that employ different tiling and
workload partitioning approaches. We evaluate the scalability of these
strategies for different application scenarios, varying both the number of
processors and the input dataset size, on a 128-processor IBM SP
multicomputer.
Also cross-referenced as UMIACS-TR-99-29.