Learning Response Time for WebSources using Query Feedback and
Application in Query Optimization
Files
Publication or External Link
Date
Advisor
Citation
DRUM DOI
Abstract
The rapid growth of the Internet and support for interoperability
protocols has increased the number of Web accessible sources, WebSources.
Current optimization technology for wrapper mediator architectures
needs to be extended to estimate the response time (delays) to access
WebSources and to use this delay in query optimization.
In this paper, we present a Multi-Dimensional Table (MDT), a tool that
is based on learning using query feedback from WebSources.
We describe the MDT learning algorithms, and report on the MDT learning for
WebSources. The MDT uses dimensions Time of day, Day, and Quantity of data, to learn
response times from a particular WebSource, and to predict the expected
response time (delay), and a confidence in this prediction, for some query.
Experiment data was collected from several WebSources and analyzed, to
determine those dimensions that were significant in estimating the response time
for particular WebSources. Our research shows that we can improve the quality of learning by
tuning the MDT features, e.g., including significant dimensions in the MDT,
or changing the ordering of dimensions. We then demonstrate how the MDT
prediction of delay may be used by a scrambling enabled optimizer.
A scrambling algorithm identifies some critical points of delay, where it makes
a decision to scramble (modify) a plan, to attempt to hide the expected delay
by computing some other part of the plan that is unaffected by the delay.
We explore the space of real delay at a WebSource, versus the MDT prediction of
this delay, with respect to critical points of delay in specific plans.
We identify those cases where MDT overestimation or underestimation of the
real delay results in a penalty in the scrambling enabled optimizer, and
those cases where there is no penalty. Using the experimental
data and MDT learning, we test how good the MDT is in minimizing these penalties.
Also cross-referenced as UMIACS TR #98-64