High Performance Spatial Indexing for Parallel I/O and Centralized Architectures
Abstract
Recently, spatial databases have attracted increasing interest in the
database field. Because of the volume of data they handle, the
performance of spatial database systems is important. The
R-tree is an efficient spatial access method. It is a generalization of
the B-tree to multidimensional space. This thesis investigates how to
improve the performance of R-trees. We consider both parallel I/O and
centralized architectures.
For a parallel I/O environment, we propose an R-tree design for a server
with one CPU and multiple disks. On this architecture, the nodes of the
R-tree are distributed among the disks and linked by cross-disk
pointers ('Multiplexed R-tree'). When a new node is created we have to
decide on which disk it will be stored. We propose and examine several
criteria for choosing a disk for a new node. The most successful one,
termed 'Proximity Index' or PI, estimates the similarity of the new node
to other R-tree nodes already on a disk and chooses the disk with the
least degree of similarity.
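As an illustration of this disk-selection rule, the following Python
sketch assigns a new node to the disk whose resident nodes are least
similar to it. The abstract does not say how PI is computed, so the
similarity function used here (intersection area divided by union area
of the bounding rectangles) is only a stand-in assumption, and the names
Rect, overlap_similarity, and choose_disk are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A node is represented by its bounding rectangle (xmin, ymin, xmax, ymax).
Rect = Tuple[float, float, float, float]


def area(r: Rect) -> float:
    """Area of a rectangle; degenerate rectangles have area 0."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def overlap_similarity(a: Rect, b: Rect) -> float:
    """Stand-in similarity (an assumption, NOT the thesis's PI):
    intersection area over union area, in the range [0, 1]."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    if w <= 0 or h <= 0:
        return 0.0  # disjoint or merely touching rectangles
    inter = w * h
    return inter / (area(a) + area(b) - inter)


def choose_disk(
    new_node: Rect,
    disks: Dict[int, List[Rect]],
    similarity: Callable[[Rect, Rect], float] = overlap_similarity,
) -> int:
    """Return the id of the disk whose nodes are least similar to new_node.

    A disk's score is the highest similarity between new_node and any
    node already stored on it. Ties go to the disk holding fewer nodes,
    then to the lower disk id.
    """
    if not disks:
        raise ValueError("at least one disk is required")

    def score(disk_id: int) -> Tuple[float, int, int]:
        nodes = disks[disk_id]
        worst = max((similarity(new_node, n) for n in nodes), default=0.0)
        return (worst, len(nodes), disk_id)

    return min(disks, key=score)
```

For example, with disks = {0: [(0, 0, 2, 2)], 1: [(10, 10, 12, 12)]}, a
new node (1, 1, 3, 3) overlaps the node on disk 0 and is therefore
placed on disk 1. The presumed rationale is that similar nodes tend to
be requested by the same query, so keeping them on different disks lets
them be read in parallel.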
For a centralized environment, we propose a new packing technique for
R-trees for static databases. We use space-filling curves, and
specifically the Hilbert curve, to achieve better ordering of rectangles
and eventually to achieve better packing. For dynamic databases we
introduce the Hilbert R-tree, in which every node has a well-defined set
of sibling nodes; we can thus use the concept of local rotation [47]. By
adjusting the split policy, the Hilbert R-tree can achieve space
utilization as high as desired.
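The static packing idea can be sketched in Python as follows: order the
rectangles along the Hilbert curve, then fill leaf nodes with
consecutive runs of that ordering. This is a minimal sketch under
several assumptions not stated in the abstract: each rectangle is
ordered by the Hilbert value of its center, coordinates lie in the unit
square, and the grid side is a power of two. The names hilbert_index,
pack_leaves, and mbr are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def hilbert_index(side: int, x: int, y: int) -> int:
    """Position of grid cell (x, y) along the Hilbert curve that fills a
    side x side grid; side must be a power of two."""
    d = 0
    s = side // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has standard orientation.
        if ry == 0:
            if rx == 1:
                x = side - 1 - x
                y = side - 1 - y
            x, y = y, x
        s //= 2
    return d


def pack_leaves(rects: List[Rect], capacity: int,
                side: int = 1 << 16) -> List[List[Rect]]:
    """Sort rectangles by the Hilbert value of their centers and cut the
    sorted list into leaves holding up to `capacity` rectangles each."""
    if capacity < 1:
        raise ValueError("capacity must be positive")

    def key(r: Rect) -> int:
        cx = (r[0] + r[2]) / 2
        cy = (r[1] + r[3]) / 2
        # Map the center from the unit square onto the integer grid.
        gx = min(int(cx * side), side - 1)
        gy = min(int(cy * side), side - 1)
        return hilbert_index(side, gx, gy)

    ordered = sorted(rects, key=key)
    return [ordered[i:i + capacity] for i in range(0, len(ordered), capacity)]


def mbr(rects: List[Rect]) -> Rect:
    """Minimum bounding rectangle of a non-empty group of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))
```

Because every leaf except possibly the last is filled to capacity, the
packed leaves are close to full, and because the Hilbert curve keeps
nearby cells close in the ordering, each leaf tends to cover a compact
region. Applying pack_leaves again to the leaf MBRs would build the
upper levels in the same way.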
(Also cross-referenced as UMIACS-TR-94-131)