Data Centric Cache Measurement Using Hardware and Software Instrumentation

dc.contributor.advisor: Hollingsworth, Jeffrey K [en_US]
dc.contributor.author: Buck, Bryan Roger [en_US]
dc.description.abstract: The speed at which microprocessors can perform computations is increasing faster than the speed of access to main memory, making efficient use of memory caches ever more important. Because of this, information about the cache behavior of applications is valuable for performance tuning. To be most useful to a programmer, this information should be presented in a way that relates it to data structures at the source code level; we will refer to this as data centric cache information. This dissertation examines the problem of how to collect such information. We describe techniques for accomplishing this using hardware performance monitors and software instrumentation. We discuss both performance monitoring features that are present in existing processors and a proposed feature for future designs. The first technique we describe uses sampling of cache miss addresses, relating them to data structures. We present the results of experiments using an implementation of this technique inside a simulator, which show that it can collect the desired information accurately and with low overhead. We then discuss a tool called Cache Scope that implements this on actual hardware, the Intel Itanium 2 processor. Experiments with this tool validate that perturbation and overhead can be kept low in a real-world setting. We present examples of tuning the performance of two applications based on data from this tool. By changing only the layout of data structures, we achieved approximately 24% and 19% reductions in running time. We also describe a technique that uses a proposed hardware feature that provides information about cache evictions to sample eviction addresses. We present results from an implementation of this technique inside a simulator, showing that even though this requires storing considerably more data than sampling cache misses, we are still able to collect information accurate enough to be useful while keeping overhead low. We discuss an example of performance tuning in which we were able to reduce the running time of an application by 8% using information gained from this tool. [en_US]
dc.format.extent: 710011 bytes
dc.title: Data Centric Cache Measurement Using Hardware and Software Instrumentation [en_US]
dc.relation.isAvailableAt: Digital Repository at the University of Maryland [en_US]
dc.relation.isAvailableAt: University of Maryland (College Park, Md.) [en_US]
dc.contributor.department: Computer Science [en_US]
dc.subject.pqcontrolled: Computer Science [en_US]
