A Comparative Study of Spatial Indexing Techniques for Multidimensional Scientific Datasets

Files

CS-TR-4556.ps (449.16 KB)
No. of downloads: 348
CS-TR-4556.pdf (173.46 KB)
No. of downloads: 1403

Date

2004-01-29

Abstract

Scientific applications that query very large multidimensional datasets are becoming more common. These datasets grow in size every day and are becoming so large that indexing individual data elements is infeasible. We have instead been experimenting with chunking the datasets to index them, grouping data elements into small chunks of a fixed, but dataset-specific, size to take advantage of spatial locality. While spatial indexing structures based on R-trees perform reasonably well for the rectangular bounding boxes of such chunked datasets, other indexing structures based on KDB-trees, such as Hybrid trees, have been shown to perform very well for point data. In this paper, we investigate how all these indexing structures perform for multidimensional scientific datasets, and compare their features and performance with those of SH-trees, an extension of Hybrid trees, for indexing multidimensional rectangles. Our experimental results show that the algorithms for building and searching SH-trees outperform those for R-trees, R*-trees, and X-trees for both real application and synthetic datasets and queries. We show that the SH-tree algorithms perform well for both low- and high-dimensional data, and that they scale well to high dimensions for both building and searching the trees. (UMIACS-TR-2004-03)
