Using Program Slicing to Drive Pre-Execution on Simultaneous Multithreading Processors

Files

CS-TR-4268.ps (884.98 KB)
No. of downloads: 267
CS-TR-4268.pdf (201.06 KB)
No. of downloads: 837

Date

2001-11-21

Abstract

Pre-execution uses helper threads running in spare hardware contexts to trigger cache misses in front of the main thread, hence hiding their latency. At the heart of pre-execution is the code that runs in the pre-execution threads themselves. The most common approach is for pre-execution threads to run a subset of the instructions executed by the original program, called backward slices [18], which are extracted from the main thread at the instruction level. This paper proposes a new pre-execution technique that uses program slicing [2] to extract the code for pre-execution threads. Program slicing performs static analysis on the program source to create slices consisting of source code rather than binary code. Compared to previous techniques, our approach requires less hardware, and is more natural to automate in a compiler. To study the feasibility of our approach, we built a slicing system based on a publicly available program slicer, called Unravel, that constructs program slices for pre-execution. We also developed several program slice parallelization techniques that partition our program slices onto multiple pre-execution threads. Our techniques enable pre-execution threads to effectively get ahead of the main thread by exploiting thread-level parallelism. Finally, our work provides an evaluation of program slice driven pre-execution using a detailed simulator of a simultaneous multithreading (SMT) processor. Our techniques achieve a 27.4% speedup across 7 integer applications on an 8-way SMT with 4 contexts, and a 56.7% speedup on an SMT with 9 contexts. (Also referenced as UMIACS-TR-2001-49)