MOCHA: A Self-Extensible Middleware Substrate for Distributed Data Sources
Abstract
This paper describes MOCHA, a self-extensible middleware substrate designed to
interconnect data sources distributed over a computer network. MOCHA is
designed to scale to large environments and is based on the idea that the
functionality in the system should be deployed by the middleware itself.
This is realized by shipping the code that implements advanced data
types or tailored query operators to remote data sources and executing
it there. Optimized query plans push the evaluation of powerful
data-reducing operators to the data sites while executing data-inflating
operators at the client's site. The Volume Reduction Factor is a new cost
metric introduced to select the best site to execute query operators and
is shown to be more accurate than the standard selectivity factor. MOCHA
has been implemented in Java and runs on top of the Informix Universal
Server. In this paper we present the architecture of MOCHA, the ideas
behind it, and a performance study using data and queries from the Sequoia
2000 Benchmark. The results of this study demonstrate that MOCHA not only
provides a flexible and scalable framework but also substantially improves
query performance compared with traditional middleware solutions.
(Also cross-referenced as UMIACS-TR-98-67)
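
The operator placement rule described in the abstract can be illustrated with a minimal sketch. The abstract does not give a formula for the Volume Reduction Factor, so the sketch assumes it is the ratio of the data volume an operator emits to the data volume it consumes; the names OperatorProfile, volume_reduction_factor, and choose_site are hypothetical and are not part of MOCHA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatorProfile:
    """Hypothetical per-operator statistics; not MOCHA's actual catalog."""
    name: str
    bytes_in: int             # average bytes consumed per input tuple
    bytes_out: int            # average bytes produced per output tuple
    selectivity: float = 1.0  # fraction of input tuples that survive


def volume_reduction_factor(op: OperatorProfile) -> float:
    # Assumed definition: volume of data leaving the operator divided by
    # the volume entering it. Values below 1 mean the operator shrinks data.
    if op.bytes_in <= 0:
        raise ValueError("bytes_in must be positive")
    return (op.selectivity * op.bytes_out) / op.bytes_in


def choose_site(op: OperatorProfile) -> str:
    # Data-reducing operators run at the data site so less data crosses
    # the network; data-inflating operators run at the client.
    return "data_site" if volume_reduction_factor(op) < 1.0 else "client"


if __name__ == "__main__":
    # An aggregate over a large raster keeps every tuple (selectivity 1.0)
    # yet shrinks each one to a single number.
    aggregate = OperatorProfile("raster_average", bytes_in=1_000_000, bytes_out=8)
    # An operator that expands a compact polygon into a rendered image.
    render = OperatorProfile("render_polygon", bytes_in=100, bytes_out=4_000)
    for op in (aggregate, render):
        print(op.name, volume_reduction_factor(op), choose_site(op))
```

Under these assumptions, the aggregate has a selectivity of 1.0, so a selectivity-based rule would see no benefit in evaluating it remotely, while the volume ratio shows that it sharply reduces the data to be shipped. This is one way to read the abstract's claim that the Volume Reduction Factor is more accurate than the standard selectivity factor.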