EVALUATING THE IMPACT OF MEMORY SYSTEM PERFORMANCE ON SOFTWARE PREFETCHING AND LOCALITY OPTIMIZATIONS
Abstract
Software prefetching and locality optimizations are two techniques for
overcoming the speed gap between processor and memory, known as the
memory wall, as described by Wulf and McKee. This
thesis evaluates the impact of memory trends on the effectiveness
of software prefetching and locality optimizations for three types
of applications: regular scientific codes, irregular scientific
codes, and pointer-chasing codes. For many applications, software
prefetching outperforms locality optimizations when there is
sufficient bandwidth in the underlying memory system, but locality
optimizations outperform software prefetching when the underlying
memory system does not provide sufficient bandwidth. The break-even
point, or equivalently the crossover bandwidth point, occurs at
roughly 2.4 GBytes/sec for 1 GHz processors on today's memory
systems, and will increase on future memory systems. This thesis
also studies the interactions between software prefetching and
locality optimizations when applied in concert. Naively combining
the two techniques makes application performance more robust
in the face of variations in memory bandwidth and/or latency, but
does not yield additional performance gains: the combined
performance is no better than that of the better of the two
techniques applied alone. Several algorithms are also proposed and
evaluated to better combine software prefetching and locality
optimizations, including an enhanced tiling algorithm, padding for
software prefetching, and index prefetching.
(Also UMIACS-TR-2002-72)
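
For readers unfamiliar with the two techniques compared in the abstract, the following is a minimal sketch in C and is not code from the thesis. It assumes a GCC- or Clang-compatible compiler (for `__builtin_prefetch`). The function names and the `PREFETCH_DISTANCE` and `TILE` constants are hypothetical and chosen only for illustration. The first function issues a software prefetch a fixed distance ahead of the current access, which hides latency but consumes memory bandwidth. The second applies tiling, a locality optimization that reorders accesses so that data is reused while it is still in cache.

```c
#include <stddef.h>

/* Hypothetical tuning parameters for illustration; not values from the thesis. */
#define PREFETCH_DISTANCE 16 /* how many elements ahead to prefetch */
#define TILE 32              /* tile edge length, in elements */

/* Software prefetching: while summing a[i], request a[i + PREFETCH_DISTANCE]
 * so that it is (ideally) in cache by the time the loop reaches it. */
double sum_with_prefetch(const double *a, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* GCC/Clang builtin: address, rw (0 = read), locality hint (0..3). */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 3);
        }
        sum += a[i];
    }
    return sum;
}

/* Locality optimization (tiling): transpose an n x n row-major matrix one
 * TILE x TILE block at a time, so the lines touched in src and dst are
 * reused before they are evicted. */
void transpose_tiled(const double *src, double *dst, size_t n)
{
    for (size_t ii = 0; ii < n; ii += TILE) {
        for (size_t jj = 0; jj < n; jj += TILE) {
            size_t i_end = (ii + TILE < n) ? ii + TILE : n;
            size_t j_end = (jj + TILE < n) ? jj + TILE : n;
            for (size_t i = ii; i < i_end; i++) {
                for (size_t j = jj; j < j_end; j++) {
                    dst[j * n + i] = src[i * n + j];
                }
            }
        }
    }
}
```

Both functions compute the same results as their untransformed loops; only the order and timing of memory accesses change. Suitable values for the prefetch distance and tile size depend on the target machine's latency, bandwidth, and cache sizes.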