On Cost-effectiveness of a Semijoin in Distributed Query Processing.
Abstract
The cost-effective reduction of relations by semijoins is the basis of the heuristic approach to distributed query processing. In the literature, the cost-effectiveness of a semijoin has been determined in a simple way, under the assumption that the local processing cost is negligible compared with the data transmission cost. Recently, however, the validity of this assumption has been questioned, and some experimental work has shown that the local processing cost is also significant in distributed query processing. In this paper, we examine the cost-effectiveness of a semijoin when both the local processing cost and the data transmission cost are taken into account. To measure the effectiveness of a semijoin in terms of the local processing cost, we introduce the join sequence, that is, the order in which the relations are joined at the result site to answer the query. We develop a dynamic programming algorithm that generates the optimal join sequence for a given query, and a simple heuristic algorithm that generates a join sequence at lower computational cost.
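To make the idea of an optimal join sequence concrete, the following is a minimal sketch of a dynamic program over subsets of relations. It is not the algorithm from the paper, whose cost model is not given in the abstract. The assumptions here are illustrative only: the local processing cost of a sequence is taken to be the sum of the estimated sizes of all join results it produces, result sizes are estimated from hypothetical per-pair selectivity factors under an independence assumption, and only left-deep sequences are considered. The names `optimal_join_sequence`, `cardinalities`, and `selectivity` are invented for this example.

```python
from itertools import combinations


def optimal_join_sequence(cardinalities, selectivity):
    """Return (cost, sequence) for the cheapest left-deep join sequence.

    cardinalities: dict mapping relation name -> estimated row count.
    selectivity:   dict mapping frozenset({a, b}) -> join selectivity factor;
                   pairs that are not listed are treated as Cartesian
                   products (factor 1.0).
    Assumed cost model (not from the paper): the sum of the estimated sizes
    of every join result produced along the sequence.
    """
    names = sorted(cardinalities)

    def size(subset):
        # Estimated row count of joining every relation in `subset`.
        rows = 1.0
        for name in subset:
            rows *= cardinalities[name]
        for a, b in combinations(sorted(subset), 2):
            rows *= selectivity.get(frozenset((a, b)), 1.0)
        return rows

    # best[subset] = (cost, sequence) of the cheapest order joining `subset`.
    best = {frozenset([name]): (0.0, [name]) for name in names}

    # Build solutions for larger subsets from solutions for smaller ones.
    for k in range(2, len(names) + 1):
        for combo in combinations(names, k):
            subset = frozenset(combo)
            out_rows = size(subset)
            candidates = []
            for last in combo:
                # `last` is the relation joined in the final step.
                prev_cost, prev_seq = best[subset - {last}]
                candidates.append((prev_cost + out_rows, prev_seq + [last]))
            best[subset] = min(candidates, key=lambda c: c[0])

    return best[frozenset(names)]


if __name__ == "__main__":
    # Hypothetical three-relation chain query R - S - T.
    cards = {"R": 1000, "S": 100, "T": 10}
    sel = {frozenset(("R", "S")): 0.01, frozenset(("S", "T")): 0.1}
    cost, sequence = optimal_join_sequence(cards, sel)
    print(cost, sequence)
```

In this sketch the table `best` holds one entry per subset of relations, so the running time grows exponentially with the number of relations. That growth is the usual reason a cheaper heuristic is offered alongside a dynamic program. Under this cost model, a semijoin that shrinks a relation before it reaches the result site would lower the corresponding entry in `cardinalities`, and so lower the cost of every sequence that includes that relation.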