Organizational Design Trade-Offs at the DRAM, Memory Bus, and Memory Controller Level: Initial Results

Cuppu, Vinodh; Jacob, Bruce

dc.contributor.author	Cuppu, Vinodh
dc.contributor.author	Jacob, Bruce
dc.date.accessioned	2007-10-25T18:39:50Z
dc.date.available	2007-10-25T18:39:50Z
dc.date.issued	1999-11
dc.description.abstract	This paper presents initial results in a study of organization-level parameters associated with the design of the primary memory system, that is, the DRAM system beneath the lowest level of the cache hierarchy. These parameters are orthogonal to architecture-level parameters such as DRAM core speed, bus arbitration protocol, etc., and include bus width, bus speed, number of independent channels, degree of banking, read burst width, write burst width, etc.; this study presents the effective cross-product of varying each of these parameters independently. The simulator is based on SimpleScalar 3.0a and models a fast (simulated as 2GHz), highly aggressive out-of-order uniprocessor. The interface to the primary memory system is fully non-blocking, supporting up to 32 outstanding misses at both the level-1 and level-2 caches. Our simulations show the following: (a) the choice of primary memory-system organization is critical, as it can affect total execution time by a factor of 3x for a constant CPU organization and DRAM speed; (b) the most important factors in the performance of the primary memory system are the channel speed (bus cycle time) and the granularity of data access (the burst width), each of which can independently affect total execution time by a factor of 2x; (c) for small bursts, multiple narrow independent channels to the memory system exhibit better performance than a single wide channel; for large bursts, channel cycle time is the most important factor; (d) the degree of DRAM multi-banking plays a secondary role in its impact on total execution time; (e) the optimal burst width tends to be high (large enough to fetch an L2 cache block in 2 bursts) and scales with the block size of the level-2 cache; and (f) the memory queue sizes can be extremely large, due to the bursty nature of references to the primary memory system and the promotion of reads ahead of writes.
Among other things, we conclude that the scheduling of the memory bus is the primary bottleneck and that it should be the focus of further study.	en
dc.format.extent	144397 bytes
dc.format.mimetype	application/pdf
dc.identifier.citation	"Organizational design trade-offs at the DRAM, memory bus, and memory controller level: Initial results." Vinodh Cuppu and Bruce Jacob. University of Maryland Systems and Computer Architecture Group Technical Report UMD-SCA-TR-1999-2. November 1999.	en
dc.identifier.uri	http://hdl.handle.net/1903/7439
dc.language.iso	en_US	en
dc.relation.isAvailableAt	A. James Clark School of Engineering	en_us
dc.relation.isAvailableAt	Electrical & Computer Engineering	en_us
dc.relation.isAvailableAt	Digital Repository at the University of Maryland	en_us
dc.relation.isAvailableAt	University of Maryland (College Park, MD)	en_us
dc.subject	DRAM	en
dc.subject	bus arbitration protocol	en
dc.title	Organizational Design Trade-Offs at the DRAM, Memory Bus, and Memory Controller Level: Initial Results	en
dc.type	Technical Report	en

Files

Original bundle

Name: UMD-SCA-TR-1999-2.pdf
Size: 141.01 KB
Format: Adobe Portable Document Format

Collections

Electrical & Computer Engineering Research Works