Multi-Dimensional Joins
Multi-Dimensional Joins
Files
Publication or External Link
Date
2007-01-11
Authors
Jacox, Edwin
Advisor
Samet, Hanan
Citation
DRUM DOI
Abstract
We present three novel algorithms for performing multi-dimensional
joins and an in-depth survey and analysis of a low-dimensional
spatial join. The first algorithm, the Iterative Spatial Join,
performs a spatial join on low-dimensional data and is based
on a plane-sweep technique.
As we show analytically and experimentally,
the Iterative Spatial Join performs well when internal memory is
limited, compared to competing methods. This suggests that
the Iterative Spatial Join would be useful for very large data sets
or in situations where internal memory is a shared resource and
is therefore limited, such as with today's database engines which
share internal memory amongst several queries. Furthermore, the
performance of the Iterative Spatial Join is predictable and has
no parameters which need to be tuned, unlike other algorithms.
The second algorithm, the Quickjoin algorithm,
performs a higher-dimensional
similarity join in which pairs of objects that lie within a
certain distance epsilon of each other are reported.
The Quickjoin algorithm overcomes drawbacks of competing methods,
such as requiring embedding methods on the data first or using
multi-dimensional indices, which limit
the ability to discriminate between objects in each
dimension, thereby degrading performance.
A formal analysis is provided of the Quickjoin method, and
experiments show that the Quickjoin method significantly outperforms
competing methods.
The third algorithm adapts
incremental join techniques to improve the
speed of calculating the Hausdorff distance, which
is used in applications such as image matching, image analysis,
and surface approximations.
The nearest neighbor incremental join technique for indices that
are based on hierarchical containment use a priority queue
of index node pairs and bounds on the distance values between
pairs, both of which need to modified in order to calculate the
Hausdorff distance. Results of experiments are described that
confirm the performance improvement.
Finally, a survey is provided which
instead of just summarizing the literature and presenting each
technique in its entirety, describes distinct components of
the different techniques, and each technique is decomposed into
an overall framework for performing a spatial join.