Technology Implications for Large Last-Level Caches
dc.contributor.advisor | Jacob, Bruce | en_US |
dc.contributor.author | Chang, Mu-Tien | en_US |
dc.contributor.department | Electrical Engineering | en_US |
dc.contributor.publisher | Digital Repository at the University of Maryland | en_US |
dc.contributor.publisher | University of Maryland (College Park, Md.) | en_US |
dc.date.accessioned | 2014-02-06T06:31:13Z | |
dc.date.available | 2014-02-06T06:31:13Z | |
dc.date.issued | 2013 | en_US |
dc.description.abstract | Large last-level cache (L3C) is efficient for bridging the performance and power gap between processor and memory. Several memory technologies, including SRAM, STT-RAM (MRAM), and embedded DRAM (eDRAM), have been used or considered as the technology to implement L3Cs. However, each of them has inherent weaknesses: SRAM is relatively low density and dissipates high leakage; STT-RAM has long write latency and requires high write energy; eDRAM requires refresh. As future processors are expected to have larger last-level caches, the objective of this dissertation is to study the tradeoffs associated with using each of these technologies to implement L3Cs. In order to make useful comparisons between L3Cs built with SRAM, STT-RAM, and eDRAM, we consider and implement several levels of details. First, to obtain unbiased cache performance and power properties (i.e., read/write access latency, read/write access energy, leakage power, refresh power, area), we prototype caches based on realistic memory and device models. Second, we present simplistic analytical models that enable us to quickly examine different memory technologies under various scenarios. Third, we review power-optimization techniques for each of the technologies, and propose using a low-cost dead-line prediction scheme for eDRAM-based L3Cs to eliminate unnecessary refreshes. Finally, the highlight of this dissertation is the comparison and analysis of low-leakage SRAM, low write-energy STT-RAM, and refresh-optimized eDRAM. We report system performance, last-level cache energy breakdown, and memory hierarchy energy breakdown, using an augmented full-system simulator with the execution of a range of workloads and input sets. From the insights gained through simulation results, STT-RAM has the highest potential to save energy in future L3C designs. For contemporary processors, SRAM-based L3C results in the fastest system performance, whereas eDRAM consumes the lowest energy. | en_US |
dc.identifier.uri | http://hdl.handle.net/1903/14837 | |
dc.language.iso | en | en_US |
dc.subject.pqcontrolled | Computer engineering | en_US |
dc.title | Technology Implications for Large Last-Level Caches | en_US |
dc.type | Dissertation | en_US |
Files
Original bundle
1 - 1 of 1