Scalability Analysis of Declustering Methods for Cartesian Product Files
Abstract
Efficient storage and retrieval of multi-attribute datasets
has become an essential requirement for many data-intensive
applications. The Cartesian product file is known to be an effective
multi-attribute file structure for partial-match and best-match queries.
Several heuristic methods have been developed to decluster Cartesian
product files over multiple disks to obtain high performance for disk
accesses. Although the scalability of declustering methods becomes
increasingly important for systems equipped with a large number of disks,
no analytic study of it has been carried out so far.
In this paper we derive formulas describing the scalability
of two popular declustering methods, Disk Modulo and Fieldwise Xor,
for range queries, which are the most common type of query.
These formulas disclose the limited scalability of the declustering methods
and are corroborated by extensive simulation experiments.
From a practical point of view, the formulas given in this paper provide a simple measure that can be used to predict the response time of a given range query and to guide the selection of a declustering method under various conditions.
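To make the two methods concrete, the following is a minimal sketch that assumes the definitions commonly given in the declustering literature (they are not stated in this abstract): Disk Modulo places a bucket on the disk given by the sum of its coordinates modulo the number of disks, and Fieldwise Xor uses the bitwise exclusive-or of the coordinates, reduced modulo the number of disks. The sketch does not reproduce the paper's formulas; it counts by brute force, taking the response time of a range query to be the largest number of qualifying buckets on any single disk. All function names are hypothetical.

```python
from collections import Counter
from functools import reduce
from itertools import product
from operator import xor


def disk_modulo(bucket, num_disks):
    """Assumed Disk Modulo rule: (i1 + i2 + ... + id) mod M."""
    return sum(bucket) % num_disks


def fieldwise_xor(bucket, num_disks):
    """Assumed Fieldwise Xor rule: (i1 xor i2 xor ... xor id) mod M."""
    return reduce(xor, bucket) % num_disks


def response_time(ranges, num_disks, method):
    """Brute-force response time of a range query.

    ranges: one inclusive (low, high) pair of bucket coordinates per attribute.
    Returns the maximum number of qualifying buckets that land on one disk.
    """
    axes = [range(low, high + 1) for low, high in ranges]
    loads = Counter(method(bucket, num_disks) for bucket in product(*axes))
    return max(loads.values())


def optimal_response_time(ranges, num_disks):
    """Lower bound: ceil(number of qualifying buckets / number of disks)."""
    buckets = 1
    for low, high in ranges:
        buckets *= high - low + 1
    return -(-buckets // num_disks)


if __name__ == "__main__":
    query = [(0, 2), (0, 2)]  # a 3 x 3 range query over two attributes
    for disks in (4, 16):
        print(
            disks,
            response_time(query, disks, disk_modulo),
            response_time(query, disks, fieldwise_xor),
            optimal_response_time(query, disks),
        )
```

In this small example, going from 4 to 16 disks does not reduce the response time of the 3 x 3 query under either rule, while the lower bound drops to a single bucket access; this is only an illustration of the kind of scalability limit the abstract refers to, not a result taken from the paper.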
(Also cross-referenced as UMIACS-TR-96-5)