A Performance Prediction Framework for Data Intensive Applications on Large Scale Parallel Machines

Date
1998-10-15
Author
Uysal, Mustafa
Kurc, Tahsin M.
Sussman, Alan
Saltz, Joel
Abstract
This paper presents a simulation-based performance prediction
framework for large-scale data-intensive applications on large-scale
parallel machines. Our framework consists of two components: application
emulators and a suite of simulators. Application emulators provide a
parameterized model of the data access and computation patterns of the
applications and make it easy to change critical application components
(input data partitioning, data declustering, processing structure,
etc.). Our suite of simulators models the I/O and
communication subsystems with good accuracy and executes quickly on a
high-performance workstation to allow performance prediction of
large-scale parallel machine configurations. The key to efficient
simulation of very large-scale configurations is a technique called
loosely-coupled simulation, in which the processing structure of the
application is embedded in the simulator while data
dependencies and data distributions are preserved. We evaluate our performance
prediction tool using a set of three data-intensive applications.
(Also cross-referenced as UMIACS TR # 98-39)
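
As a rough illustration of what a parameterized application emulator might look like, the Python sketch below generates a stream of read and compute events from a handful of parameters and lets the declustering scheme be swapped by changing one field. It is not taken from the report: the names EmulatorParams, Event, owner, emulate and bytes_per_node, the parameter set, and the two declustering schemes are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, List, NamedTuple


class Event(NamedTuple):
    node: int      # processor that performs the operation
    kind: str      # "read" or "compute"
    amount: float  # bytes for reads, seconds for compute


@dataclass
class EmulatorParams:
    # Hypothetical knobs; the report's actual parameter set is not shown here.
    num_chunks: int                    # number of input data chunks
    chunk_bytes: int                   # size of each chunk in bytes
    num_nodes: int                     # processors in the target machine
    compute_per_byte: float            # seconds of computation per input byte
    declustering: str = "round_robin"  # or "block"


def owner(chunk: int, p: EmulatorParams) -> int:
    """Map a chunk to the node that stores it under the chosen declustering."""
    if p.declustering == "round_robin":
        return chunk % p.num_nodes
    if p.declustering == "block":
        per_node = -(-p.num_chunks // p.num_nodes)  # ceiling division
        return chunk // per_node
    raise ValueError(f"unknown declustering: {p.declustering}")


def emulate(p: EmulatorParams) -> Iterator[Event]:
    """Yield the access and computation pattern as a stream of events."""
    for chunk in range(p.num_chunks):
        node = owner(chunk, p)
        yield Event(node, "read", float(p.chunk_bytes))
        yield Event(node, "compute", p.chunk_bytes * p.compute_per_byte)


def bytes_per_node(p: EmulatorParams) -> List[int]:
    """Total bytes read by each node, a simple measure of I/O balance."""
    totals = [0] * p.num_nodes
    for ev in emulate(p):
        if ev.kind == "read":
            totals[ev.node] += int(ev.amount)
    return totals
```

Calling bytes_per_node with declustering set first to "round_robin" and then to "block" on the same parameters shows how a single parameter change alters the I/O load each node would present to a simulator.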