Optimization within a Unified Transformation Framework

CS-TR-3725.ps(805.21 KB)
CS-TR-3725.pdf(518.23 KB)
Kelly, Wayne
Programmers typically want to write scientific programs in a high-level language with semantics based on a sequential execution model. To execute efficiently on a parallel machine, however, a program typically needs to contain explicit parallelism and possibly explicit communication and synchronization. We therefore need compilers to convert programs from the first of these forms to the second. There are two basic choices to be made when parallelizing a program. First, the computations of the program need to be distributed among the available processors. Second, the computations on each processor need to be ordered. My contribution has been the development of simple mathematical abstractions for representing these choices and of new algorithms for making them. I have developed a new framework that achieves good performance by minimizing communication between processors, minimizing the time processors spend waiting for messages from other processors, and ordering data accesses so as to exploit the memory hierarchy. This framework can be used by optimizing compilers as well as by interactive transformation tools. The state of the art for vectorizing compilers is already quite good, but much work remains to bring parallelizing compilers up to the same standard. The main contribution of my work can be summarized as improving this situation by replacing existing ad hoc parallelization techniques with a sound underlying foundation on which future work can be built. (Also cross-referenced as UMIACS-TR-96-93)
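
The two choices named in the abstract (which processor runs each computation, and in what order each processor runs its share) can be illustrated with a small toy model. The sketch below is only an illustration under stated assumptions and is not the formulation used in the report: the loop nest, the bounds `N`, `M`, the processor count `P`, and the helper names `space`, `time`, `is_legal`, and `communication` are all hypothetical. It models a loop nest in which iteration `(i, j)` reads the value produced by iteration `(i - 1, j)`, then checks that a chosen ordering respects every dependence and counts how many dependences cross a processor boundary.

```python
from itertools import product

# Hypothetical problem size and processor count (assumptions for illustration).
N, M, P = 4, 6, 3


def iterations():
    """All iterations (i, j) of a toy N x M loop nest."""
    return list(product(range(N), range(M)))


def dependences():
    """Pairs (source, sink): iteration (i, j) reads what (i - 1, j) wrote."""
    return [((i - 1, j), (i, j)) for i, j in iterations() if i >= 1]


def space(it):
    """Choice 1: the processor that executes an iteration (columns dealt cyclically)."""
    _, j = it
    return j % P


def time(it):
    """Choice 2: the position of an iteration in its processor's local order."""
    i, j = it
    return (i, j // P)


def is_legal(deps, time_map):
    """An ordering is legal here if every source runs strictly before its sink."""
    return all(time_map(src) < time_map(dst) for src, dst in deps)


def communication(deps, space_map):
    """Number of dependences whose source and sink are on different processors."""
    return sum(1 for src, dst in deps if space_map(src) != space_map(dst))


legal = is_legal(dependences(), time)
messages = communication(dependences(), space)
by_rows = communication(dependences(), lambda it: it[0] % P)
print(legal, messages, by_rows)
```

In this toy setting, distributing by column keeps every dependence on a single processor, whereas distributing by row (the `by_rows` variant) makes every dependence cross a processor boundary. This is the kind of trade-off between distribution and communication that the abstract refers to.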