Last Level Cache (LLC) Performance of Data Mining Workloads On a CMP — Case Study of Parallel Bioinformatics Workloads
dc.contributor.author | Jaleel, Aamer | |
dc.contributor.author | Mattina, Matthew | |
dc.contributor.author | Jacob, Bruce | |
dc.date.accessioned | 2007-11-08T18:43:32Z | |
dc.date.available | 2007-11-08T18:43:32Z | |
dc.date.issued | 2006-02 | |
dc.description.abstract | With the continuing growth in the amount of genetic data, members of the bioinformatics community are developing a variety of data-mining applications to understand the data and discover meaningful information. These applications are important in defining the design and performance decisions of future high performance microprocessors. This paper presents a detailed data-sharing analysis and chip-multiprocessor (CMP) cache study of several multithreaded data-mining bioinformatics workloads. For a CMP with a three-level cache hierarchy, we model the last-level of the cache hierarchy as either multiple private caches or a single cache shared amongst different cores of the CMP. Our experiments show that the bioinformatics workloads exhibit significant data-sharing—50–95% of the data cache is shared by the different threads of the workload. Furthermore, regardless of the amount of data cache shared, for some workloads, as many as 98% of the accesses to the last-level cache are to shared data cache lines. Additionally, the amount of data-sharing exhibited by the workloads is a function of the total cache size available—the larger the data cache the better the sharing behavior. Thus, partitioning the available last-level cache silicon area into multiple private caches can cause applications to lose their inherent data-sharing behavior. For the workloads in this study, a shared 32MB last-level cache is able to capture a tremendous amount of data-sharing and outperform a 32MB private cache configuration by several orders of magnitude. Specifically, with shared last-level caches, the bandwidth demands beyond the last-level cache can be reduced by factors of 3–625 when compared to private last-level caches. | en |
dc.format.extent | 377065 bytes | |
dc.format.mimetype | application/pdf | |
dc.identifier.citation | "Last-level cache (LLC) performance of data-mining workloads on a CMP--A case study of parallel bioinformatics workloads." Aamer Jaleel, Matthew Mattina, and Bruce Jacob. Proc. 12th International Symposium on High Performance Computer Architecture (HPCA 2006), Austin TX, February 2006. | en |
dc.identifier.uri | http://hdl.handle.net/1903/7453 | |
dc.language.iso | en_US | en |
dc.relation.isAvailableAt | A. James Clark School of Engineering | en_us |
dc.relation.isAvailableAt | Electrical & Computer Engineering | en_us |
dc.relation.isAvailableAt | Digital Repository at the University of Maryland | en_us |
dc.relation.isAvailableAt | University of Maryland (College Park, MD) | en_us |
dc.subject | bioinformatics | en |
dc.subject | chip-multiprocessor (CMP) | en |
dc.subject | cache hierarchy | en |
dc.title | Last Level Cache (LLC) Performance of Data Mining Workloads On a CMP — Case Study of Parallel Bioinformatics Workloads | en |
dc.type | Presentation | en |
Files
Original bundle
1 - 1 of 1