Multi-Dimensional Joins
Abstract
We present three novel algorithms for performing multi-dimensional joins and an in-depth survey and analysis of a low-dimensional spatial join. The first algorithm, the Iterative Spatial Join, performs a spatial join on low-dimensional data and is based on a plane-sweep technique. As we show analytically and experimentally, the Iterative Spatial Join performs well when internal memory is limited, compared to competing methods. This suggests that the Iterative Spatial Join would be useful for very large data sets or in situations where internal memory is a shared resource and is therefore limited, such as with today's database engines which share internal memory amongst several queries. Furthermore, the performance of the Iterative Spatial Join is predictable and has no parameters which need to be tuned, unlike other algorithms.
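To make the plane-sweep idea concrete, the following is a minimal in-memory sketch of a generic plane-sweep rectangle join. It is not the Iterative Spatial Join itself, whose details are not given in this abstract. The rectangle layout `(xmin, ymin, xmax, ymax)`, the closed-interval intersection test, and the function name `plane_sweep_join` are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed layout: (xmin, ymin, xmax, ymax), with closed boundaries.
Rect = Tuple[float, float, float, float]


def plane_sweep_join(r: List[Rect], s: List[Rect]) -> List[Tuple[int, int]]:
    """Return sorted index pairs (i, j) where r[i] intersects s[j]."""
    # One event per rectangle, keyed by its left edge; side 0 = R, side 1 = S.
    events = [(rect[0], 0, i) for i, rect in enumerate(r)]
    events += [(rect[0], 1, j) for j, rect in enumerate(s)]
    events.sort()

    active_r: List[int] = []  # R rectangles whose x-extent may reach the sweep line
    active_s: List[int] = []  # S rectangles whose x-extent may reach the sweep line
    result: List[Tuple[int, int]] = []

    for x, side, idx in events:
        if side == 0:
            rect = r[idx]
            # Discard S rectangles that end to the left of the sweep line.
            active_s = [j for j in active_s if s[j][2] >= x]
            for j in active_s:
                # x-overlap is guaranteed here; only the y-extents need checking.
                if rect[1] <= s[j][3] and s[j][1] <= rect[3]:
                    result.append((idx, j))
            active_r.append(idx)
        else:
            rect = s[idx]
            # Discard R rectangles that end to the left of the sweep line.
            active_r = [i for i in active_r if r[i][2] >= x]
            for i in active_r:
                if rect[1] <= r[i][3] and r[i][1] <= rect[3]:
                    result.append((i, idx))
            active_s.append(idx)

    return sorted(result)
```

Each intersecting pair is reported exactly once, at the moment the rectangle with the larger left edge is reached. This sketch keeps both active lists in memory; handling the case where they do not fit is precisely the concern the abstract raises about limited internal memory.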
The second algorithm, the Quickjoin algorithm, performs a higher-dimensional similarity join in which pairs of objects that lie within a certain distance epsilon of each other are reported. The Quickjoin algorithm overcomes drawbacks of competing methods, such as the need to first apply embedding methods to the data or to use multi-dimensional indices, which limit the ability to discriminate between objects in each dimension, thereby degrading performance. A formal analysis of the Quickjoin method is provided, and experiments show that the Quickjoin method significantly outperforms competing methods.
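The output of such a similarity join can be pinned down with a brute-force reference implementation. The sketch below is a naive nested-loop baseline, not the Quickjoin algorithm; it assumes Euclidean distance, a self-join over a single point set, and an inclusive threshold (distance less than or equal to epsilon), none of which are stated in the abstract. The name `similarity_join_naive` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def similarity_join_naive(
    points: Sequence[Sequence[float]], eps: float
) -> List[Tuple[int, int]]:
    """Return index pairs (i, j), i < j, whose points lie within eps of each other."""
    result: List[Tuple[int, int]] = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            # Assumption: Euclidean metric with an inclusive threshold.
            if math.dist(points[i], points[j]) <= eps:
                result.append((i, j))
    return result
```

This baseline compares every pair and therefore needs a quadratic number of distance computations, which is the cost that faster similarity join methods aim to avoid. It can still serve as a correctness check for them on small inputs.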
The third algorithm adapts incremental join techniques to improve the speed of calculating the Hausdorff distance, which is used in applications such as image matching, image analysis, and surface approximations. The nearest neighbor incremental join technique for indices that are based on hierarchical containment uses a priority queue of index node pairs and bounds on the distance values between pairs, both of which need to be modified in order to calculate the Hausdorff distance. Results of experiments are described that confirm the performance improvement.
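For reference, the Hausdorff distance between two finite point sets can be computed directly from its standard definition: the directed distance from A to B is the largest nearest-neighbor distance of any point in A, and the symmetric distance is the larger of the two directed distances. The sketch below is that brute-force definition, not the incremental join approach described above; it assumes Euclidean distance and non-empty inputs, and the function names are hypothetical.

```python
import math
from typing import Sequence

Point = Sequence[float]


def directed_hausdorff(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Largest distance from any point in a to its nearest neighbor in b."""
    # Assumption: both sets are non-empty and use the Euclidean metric.
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))
```

For example, with `a = [(0, 0), (1, 0)]` and `b = [(0, 0), (4, 0)]`, the directed distance from `a` to `b` is 1 and from `b` to `a` is 3, so the Hausdorff distance is 3. The brute-force form evaluates every pair of points, which is the cost an index-based method would aim to reduce.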
Finally, a survey is provided that, instead of just summarizing the literature and presenting each technique in its entirety, describes the distinct components of the different techniques and decomposes each technique into an overall framework for performing a spatial join.