EVALUATING THE IMPACT OF MEMORY SYSTEM PERFORMANCE ON SOFTWARE PREFETCHING AND LOCALITY OPTIMIZATIONS

Loading...
Thumbnail Image

Files

CS-TR-4392.pdf (569.22 KB)
No. of downloads: 1200

Publication or External Link

Date

2002-09-18

Advisor

Citation

DRUM DOI

Abstract

Software prefetching and locality optimizations are two techniques for overcoming the speed gap between processor and memory known as the memory wall as suggested by Wulf and Mckee. This thesis evaluates the impact of memory trends on the effectiveness of software prefetching and locality optimizations for three types of applications: regular scientific codes, irregular scientific
codes, and pointer-chasing codes. For many applications, software prefetching outperforms locality optimizations when there is sufficient bandwidth in the underlying memory system, but locality optimizations outperform software prefetching when the underlying memory system doesn't provide sufficient bandwidth. The break-even point, or equivalently the crossover bandwidth point, occurs at
roughly 2.4 GBytes/sec , for 1 GHz processors on today's memory systems, and will increase on future memory systems. This thesis also studies the interactions between software prefetching and locality optimizations when applied in concert. Naively combining the two techniques provides a more robust application performance in the face of variations in memory bandwidth and/or latency, but does not yield additional performance gains. In other words, the
performance won't be better than the best performance of the two techniques alone. Also, several algorithms are proposed and evaluated to better combine software prefetching and locality
optimizations, including an enhanced tiling algorithm, padding for software prefetching, and index prefetching. (Also UMIACS-TR-2002-72)

Notes

Rights