Efficient Execution of Multi-Query Data Analysis Batches Using Compiler Optimization Strategies
Abstract
This work investigates the leverage that can be obtained from compiler optimization techniques for efficient execution of multi-query workloads in data analysis applications. Our approach is to address multi-query optimization at the algorithmic level by transforming a declarative specification of scientific data analysis queries into a high-level imperative program that can be made more efficient by applying compiler optimization techniques. These techniques -- including loop fusion, common subexpression elimination, and dead code elimination -- are employed to allow data and computation reuse across queries. We describe a preliminary experimental analysis on a real remote sensing application that is used to analyze very large quantities of satellite data. The results show that our techniques achieve a sizable reduction in the amount of computation and I/O necessary for executing query batches, and in the average execution time of the individual queries in a given batch. (UMIACS-TR-2003-76)
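As a rough illustration of the idea in the abstract, the sketch below shows how loop fusion and common subexpression elimination might let two queries over the same data share reads and computation. It is not taken from the report: the two queries (a maximum and a sum), the `correct` function, the `Counters` class, and the sample data are all hypothetical stand-ins chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    """Tallies simulated I/O and computation (illustrative only)."""
    reads: int = 0
    computes: int = 0


def correct(value, counters):
    # Hypothetical per-element processing step that both queries need.
    counters.computes += 1
    return value * 0.5 + 1.0


def run_separately(data, counters):
    # Query 1: maximum of the corrected values (its own pass over the data).
    best = float("-inf")
    for v in data:
        counters.reads += 1
        best = max(best, correct(v, counters))
    # Query 2: sum of the corrected values (a second pass over the same data).
    total = 0.0
    for v in data:
        counters.reads += 1
        total += correct(v, counters)
    return best, total


def run_fused(data, counters):
    # Loop fusion: both queries are answered in a single pass.
    best = float("-inf")
    total = 0.0
    for v in data:
        counters.reads += 1
        # Common subexpression elimination: the corrected value is
        # computed once and reused by both queries.
        c = correct(v, counters)
        best = max(best, c)
        total += c
    return best, total


data = [4.0, 10.0, 2.0, 8.0]  # made-up input values
separate_counts = Counters()
fused_counts = Counters()
separate_result = run_separately(data, separate_counts)
fused_result = run_fused(data, fused_counts)
print(separate_result, separate_counts)
print(fused_result, fused_counts)
```

In this toy setting the fused version returns the same answers while reading each element once and applying `correct` once per element, which halves both counters. The savings reported for the real application depend on how much the queries in a batch actually overlap.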