Querying Very Large Multi-dimensional Datasets in ADR - Extended Abstract

Files
CS-TR-4022.ps (405.32 KB)
CS-TR-4022.pdf (103.96 KB)
Date
1999-05-26
Authors
Kurc, Tahsin
Chang, Chialin
Ferreira, Renato
Sussman, Alan
Saltz, Joel
Abstract
This paper addresses optimizing the execution of range queries into multi-dimensional datasets on distributed-memory parallel machines within the Active Data Repository (ADR) framework. ADR is an infrastructure that integrates storage, retrieval, and processing of large multi-dimensional datasets on distributed-memory parallel architectures with multiple disks attached to each node. We describe three potential strategies for efficient execution of such queries that employ different tiling and workload partitioning approaches. We evaluate the scalability of these strategies for different application scenarios, varying both the number of processors and the input dataset size on a 128-processor IBM SP multicomputer. (Also cross-referenced as UMIACS-TR-99-29.)