Efficient Execution of Multi-Query Data Analysis Batches Using Compiler Optimization Strategies

Files

CS-TR-4507.ps (537.5 KB)
No. of downloads: 234
CS-TR-4507.pdf (234.57 KB)
No. of downloads: 1170

Date

2003-08-01

Abstract

This work investigates how much compiler optimization techniques can improve the execution of multi-query workloads in data analysis applications. Our approach addresses multi-query optimization at the algorithmic level: a declarative specification of scientific data analysis queries is transformed into a high-level imperative program, which is then made more efficient by applying compiler optimization techniques. These techniques (including loop fusion, common subexpression elimination, and dead code elimination) enable data and computation reuse across queries. We describe a preliminary experimental analysis on a real remote sensing application that is used to analyze very large quantities of satellite data. The results show that our techniques achieve a sizable reduction in the amount of computation and I/O needed to execute query batches, and in the average execution times of the individual queries in a given batch. (UMIACS-TR-2003-76)
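
As a rough illustration of the idea in the abstract, the sketch below writes two queries over the same data as imperative loops, then merges them with loop fusion and common subexpression elimination. It is a toy example under stated assumptions, not the report's system: the function names, the `ndvi` formula used as the shared computation, and the sample pixel values are all hypothetical.

```python
# Hypothetical illustration only: names, the index formula, and the data
# are assumptions made for this sketch, not taken from the report.

def ndvi(nir, red):
    """Per-pixel vegetation index, standing in for an expensive shared computation."""
    return (nir - red) / (nir + red)


def run_separately(pixels):
    """Two queries as two independent loops: the data is scanned twice."""
    calls = 0
    best = float("-inf")
    for nir, red in pixels:          # query 1: maximum index value
        best = max(best, ndvi(nir, red))
        calls += 1
    total = 0.0
    for nir, red in pixels:          # query 2: mean index value
        total += ndvi(nir, red)
        calls += 1
    return best, total / len(pixels), calls


def run_fused(pixels):
    """Fused version: one scan of the data and one ndvi call per pixel."""
    calls = 0
    best = float("-inf")
    total = 0.0
    for nir, red in pixels:
        v = ndvi(nir, red)           # shared subexpression computed once
        calls += 1
        best = max(best, v)          # query 1 accumulator
        total += v                   # query 2 accumulator
    return best, total / len(pixels), calls


# Toy (near-infrared, red) reflectance pairs, assumed for the example.
PIXELS = [(0.8, 0.2), (0.6, 0.3), (0.5, 0.5)]

if __name__ == "__main__":
    print(run_separately(PIXELS))
    print(run_fused(PIXELS))
```

Both functions return the same query answers, but the fused version reads each pixel once and evaluates the shared expression once per pixel instead of twice, which is the kind of computation and I/O reuse across queries that the abstract describes.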
