MOCHA: A Self-Extensible Database Middleware System for Distributed Data Sources

Files

CS-TR-4105.ps (937.61 KB)
No. of downloads: 309
CS-TR-4105.pdf (306.87 KB)
No. of downloads: 1323

Date

2000-04-19

Abstract

This paper describes MOCHA, a new self-extensible database middleware system designed to interconnect data sources distributed over a computer network. MOCHA is designed to scale to large environments and is based on the idea that some of the user-defined functionality in the system should be deployed by the middleware itself. This is realized by shipping Java code that implements either advanced data types or tailored query operators to remote data sources, where it is executed. Optimized query plans push the evaluation of powerful data-reducing operators to the data source sites while executing data-inflating operators near the client's site. The Volume Reduction Factor, a new and more explicit metric introduced in this paper, is used to select the best site at which to execute each query operator and is shown to be more accurate than the standard selectivity factor alone. MOCHA has been implemented in Java and runs on top of Informix and Oracle. We present the architecture of MOCHA, the ideas behind it, and a performance study using data and queries from the Sequoia 2000 Benchmark. The results of this study demonstrate that MOCHA not only provides a flexible and scalable framework for distributed query processing but also substantially improves query performance compared with existing middleware solutions. (Also cross-referenced as UMIACS-TR-2000-05)
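
The placement idea in the abstract can be illustrated with a minimal sketch. It is not MOCHA's code (MOCHA itself is written in Java). The abstract does not define the Volume Reduction Factor, so the sketch assumes it is the ratio of the data volume an operator produces to the data volume it reads. The names `Operator`, `volume_reduction_factor`, and `choose_site`, and the byte counts in the usage example, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    """Hypothetical description of a query operator's data volumes."""
    name: str
    input_bytes: int   # volume of data the operator reads
    output_bytes: int  # volume of data the operator produces


def volume_reduction_factor(op: Operator) -> float:
    """Output volume divided by input volume (assumed definition)."""
    if op.input_bytes <= 0:
        raise ValueError("input_bytes must be positive")
    return op.output_bytes / op.input_bytes


def choose_site(op: Operator) -> str:
    """Run data-reducing operators at the data source, others near the client."""
    if volume_reduction_factor(op) < 1.0:
        return "data_source"  # less data leaves the source than was read
    return "client"           # avoid shipping inflated results over the network


if __name__ == "__main__":
    # An aggregate that collapses a large raster into a small summary.
    summary = Operator("summarize_raster", input_bytes=100_000_000, output_bytes=1_000)
    # An operator that keeps every tuple but enlarges each value.
    enlarge = Operator("enlarge_image", input_bytes=1_000, output_bytes=4_000)
    for op in (summary, enlarge):
        print(op.name, volume_reduction_factor(op), choose_site(op))
```

Under this assumed definition, an operator that keeps every tuple but enlarges each value has a factor above 1 and is placed near the client. A metric that only counts qualifying tuples would not distinguish it from an operator that leaves the data volume unchanged.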
