MOCHA: A Self-Extensible Database Middleware System for Distributed Data Sources

Date
2000-04-19
Author
Rodriguez-Martinez, Manuel
Roussopoulos, Nick
Abstract
This paper describes MOCHA, a new self-extensible database middleware
system designed to interconnect data sources distributed over a computer
network. MOCHA is designed to scale to large environments and is based
on the idea that some of the user-defined functionality in the system should
be deployed by the middleware itself. This is realized by shipping Java code
implementing either advanced data types or tailored query operators to remote
data sources and having it executed there. Optimized query plans push the
evaluation of powerful data-reducing operators to the data source sites while
executing data-inflating operators near the client's site. The Volume Reduction
Factor is a new and more explicit metric introduced in this paper to select the
best site to execute query operators and is shown to be more accurate than
the standard selectivity factor alone. MOCHA has been implemented in Java
and runs on top of Informix and Oracle. We present the architecture of MOCHA,
the ideas behind it, and a performance study using data and queries from the
Sequoia 2000 Benchmark. The results of this study demonstrate that MOCHA
not only provides a flexible and scalable framework for distributed query
processing but also substantially improves query performance compared with
existing middleware solutions.
(Also cross-referenced as UMIACS-TR-2000-05)
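
The placement idea in the abstract can be illustrated with a small sketch. The abstract does not define the Volume Reduction Factor, so the code below assumes, purely for illustration, that it is the ratio of the data volume an operator emits to the data volume it consumes, with values below 1 marking a data-reducing operator and values above 1 a data-inflating one. The class `VrfSketch`, its fields, and the operator names are hypothetical and are not taken from MOCHA.

```java
import java.util.List;

// Illustrative only: not MOCHA's code. All names here are hypothetical.
final class VrfSketch {

    enum Site { DATA_SOURCE, CLIENT }

    // Estimated cardinalities and per-tuple sizes for one query operator.
    static final class OperatorEstimate {
        final String name;
        final long inputTuples;
        final long outputTuples;
        final long inputBytesPerTuple;
        final long outputBytesPerTuple;

        OperatorEstimate(String name, long inputTuples, long outputTuples,
                         long inputBytesPerTuple, long outputBytesPerTuple) {
            if (inputTuples <= 0 || inputBytesPerTuple <= 0) {
                throw new IllegalArgumentException("input volume must be positive");
            }
            this.name = name;
            this.inputTuples = inputTuples;
            this.outputTuples = outputTuples;
            this.inputBytesPerTuple = inputBytesPerTuple;
            this.outputBytesPerTuple = outputBytesPerTuple;
        }
    }

    // Classic selectivity: fraction of tuples that survive the operator.
    static double selectivity(OperatorEstimate op) {
        return (double) op.outputTuples / op.inputTuples;
    }

    // Assumed definition: output data volume divided by input data volume.
    static double volumeReductionFactor(OperatorEstimate op) {
        double outVolume = (double) op.outputTuples * op.outputBytesPerTuple;
        double inVolume = (double) op.inputTuples * op.inputBytesPerTuple;
        return outVolume / inVolume;
    }

    // Data-reducing operators run at the data source; the rest run near the client.
    static Site choosePlacement(OperatorEstimate op) {
        return volumeReductionFactor(op) < 1.0 ? Site.DATA_SOURCE : Site.CLIENT;
    }

    public static void main(String[] args) {
        List<OperatorEstimate> ops = List.of(
            // Keeps every tuple but shrinks each 4 KiB value to an 8-byte summary.
            new OperatorEstimate("summarizeRaster", 1000, 1000, 4096, 8),
            // Keeps every tuple and quadruples the size of each value.
            new OperatorEstimate("enlargeImage", 1000, 1000, 4096, 16384));
        for (OperatorEstimate op : ops) {
            System.out.println(op.name
                + " selectivity=" + selectivity(op)
                + " vrf=" + volumeReductionFactor(op)
                + " site=" + choosePlacement(op));
        }
    }
}
```

Both hypothetical operators keep every input tuple, so selectivity alone cannot tell them apart. Under the assumed volume-based ratio, one is placed at the data source and the other near the client, which is the kind of distinction the abstract attributes to the metric.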