Technology Implications for Large Last-Level Caches

dc.contributor.advisor: Jacob, Bruce [en_US]
dc.contributor.author: Chang, Mu-Tien [en_US]
dc.contributor.department: Electrical Engineering [en_US]
dc.contributor.publisher: Digital Repository at the University of Maryland [en_US]
dc.contributor.publisher: University of Maryland (College Park, Md.) [en_US]
dc.date.accessioned: 2014-02-06T06:31:13Z
dc.date.available: 2014-02-06T06:31:13Z
dc.date.issued: 2013 [en_US]
dc.description.abstract: Large last-level caches (L3Cs) are effective at bridging the performance and power gap between processor and memory. Several memory technologies, including SRAM, STT-RAM (MRAM), and embedded DRAM (eDRAM), have been used or considered for implementing L3Cs. However, each of them has inherent weaknesses: SRAM has relatively low density and high leakage power; STT-RAM has long write latency and requires high write energy; eDRAM requires refresh. As future processors are expected to have larger last-level caches, the objective of this dissertation is to study the tradeoffs associated with using each of these technologies to implement L3Cs. To make useful comparisons between L3Cs built with SRAM, STT-RAM, and eDRAM, we consider and implement several levels of detail. First, to obtain unbiased cache performance and power properties (i.e., read/write access latency, read/write access energy, leakage power, refresh power, area), we prototype caches based on realistic memory and device models. Second, we present simplistic analytical models that enable us to quickly examine different memory technologies under various scenarios. Third, we review power-optimization techniques for each of the technologies, and propose using a low-cost dead-line prediction scheme for eDRAM-based L3Cs to eliminate unnecessary refreshes. Finally, the highlight of this dissertation is the comparison and analysis of low-leakage SRAM, low write-energy STT-RAM, and refresh-optimized eDRAM. We report system performance, last-level cache energy breakdown, and memory hierarchy energy breakdown, using an augmented full-system simulator running a range of workloads and input sets. The simulation results indicate that STT-RAM has the highest potential to save energy in future L3C designs. For contemporary processors, an SRAM-based L3C yields the fastest system performance, whereas eDRAM consumes the lowest energy. [en_US]
dc.identifier.uri: http://hdl.handle.net/1903/14837
dc.language.iso: en [en_US]
dc.subject.pqcontrolled: Computer engineering [en_US]
dc.title: Technology Implications for Large Last-Level Caches [en_US]
dc.type: Dissertation [en_US]

Files

Original bundle

Name: Chang_umd_0117E_14804.pdf
Size: 2.39 MB
Format: Adobe Portable Document Format