Electrical & Computer Engineering Research Works

Permanent URI for this collectionhttp://hdl.handle.net/1903/1658

Browse

Search Results

Now showing 1 - 4 of 4
  • Item
    DDR2 and Low Latency Variants
    (2000-07) Davis, Brian; Mudge, Trevor; Jacob, Bruce; Cuppu, Vinodh
    This paper describes a performance examination of the DDR2 DRAM architecture and the proposed cache-enhanced variants. These preliminary studies are based upon ongoing collaboration between the authors and the Joint Electronic Device Engineering Council (JEDEC) Low Latency DRAM Working Group, a working group within the JEDEC 42.3 Future DRAM Task Group. This Task Group is responsible for developing the DDR2 standard. The goal of the Low Latency DRAM Working Group is the creation of a single cache-enhanced (i.e. low-latency) architecture based upon this same interface. There are a number of proposals for reducing the average access time of DRAM devices, most of which involve the addition of SRAM to the DRAM device. As DDR2 is viewed as a future standard, these proposals are frequently applied to a DDR2 interface device. For the same reasons it is advantageous to have a single DDR2 specification, it is similarly beneficial to have a single low-latency specification. The authors are involved in ongoing research to evaluate which enhancements to the baseline DDR2 devices will yield lower average latency, and for what type of applications. To provide context, experimental results will be compared against those for systems utilizing PC100 SDRAM, DDR133 SDRAM, and Direct Rambus (DRDRAM). This work is just starting to produce performance data. Initial results show performance improvements for low-latency devices that are significant, but less so than a generational change in DRAM interface. It is also apparent that there are at least two classifications of applications: 1) those that saturate the memory bus, for which performance is dependent upon the potential bandwidth and bus utilization of the system; and 2) those that do not contain the access parallelism to fully utilize the memory bus, and for which performance is dependent upon the latency of the average primary memory access.
  • Item
    Concurrency, Latency, or System Overhead: Which Has the Largest Impact on Uniprocessor DRAM-System Performance?
    (2001-06) Cuppu, Vinodh; Jacob, Bruce
    Given a fixed CPU architecture and a fixed DRAM timing specification, there is still a large design space for a DRAM system organization. Parameters include the number of memory channels, the bandwidth of each channel, burst sizes, queue sizes and organizations, turnaround overhead, memory-controller page protocol, algorithms for assigning request priorities and scheduling requests dynamically, etc. In this design space, we see a wide variation in application execution times; for example, execution times for SPEC CPU 2000 integer suite on a 2-way ganged Direct Rambus organization (32 data bits) with 64-byte bursts are 10–20% lower than execution times on an otherwise identical configuration that uses 32-byte bursts. This represents two system configurations that are relatively close to each other in the design space; performance differences become even more pronounced for designs further apart. This paper characterizes the sources of overhead in high-performance DRAM systems and investigates the most effective ways to reduce a system’s exposure to performance loss. In particular, we look at mechanisms to increase a system’s support for concurrent transactions, mechanisms to reduce request latency, and mechanisms to reduce the “system overhead”—the portion of the primary memory system’s overhead that is not due to DRAM latency but rather to things like turnaround time, request queueing, inefficiencies due to read/write request interleaving, etc. Our simulator models a 2GHz, highly aggressive out-of-order uniprocessor. The interface to the memory system is fully non-blocking, supporting up to 32 outstanding misses at both the level-1 and level-2 caches and split-transaction busses to all DRAM banks.
  • Item
    A Case for Studying DRAM Issues at the System Level
    (IEEE Micro, 2003-08) Jacob, Bruce
    THE WIDENING GAP BETWEEN TODAY’S PROCESSOR AND MEMORY SPEEDS MAKES DRAM SUBSYSTEM DESIGN AN INCREASINGLY IMPORTANT PART OF COMPUTER SYSTEM DESIGN. IF THE DRAM RESEARCH COMMUNITY WOULD FOLLOW THE MICROPROCESSOR COMMUNITY’S LEAD BY LEANING MORE HEAVILY ON ARCHITECTURE- AND SYSTEM-LEVEL SOLUTIONS IN ADDITION TO TECHNOLOGY-LEVEL SOLUTIONS TO ACHIEVE HIGHER PERFORMANCE, THE GAP MIGHT BEGIN TO CLOSE.
  • Item
    DRAMsim: A Memory System Simulator
    (ACM (Association for Computing Machinery) Publications, 2005-09) Wang, David; Ganesh, Brinda; Tuaycharoen, Nuengwong; Baynes, Kathleen; Jaleel, Aamer; Jacob, Bruce
    As memory accesses become slower with respect to the processor and consume more power with increasing memory size, the focus of memory performance and power consumption has become increasingly important. With the trend to develop multi-threaded, multi-core processors, the demands on the memory system will continue to scale. However, determining the optimal memory system configuration is non-trivial. The memory system performance is sensitive to a large number of parameters. Each of these parameters take on a number of values and interact in fashions that make overall trends difficult to discern. A comparison of the memory system architectures becomes even harder when we add the dimensions of power consumption and manufacturing cost. Unfortunately, there is a lack of tools in the public-domain that support such studies. Therefore, we introduce DRAMsim, a detailed and highly configurable C-based memory system simulator to fill this gap. DRAMsim implements detailed timing models for a variety of existing memories, including SDRAM, DDR, DDR2, DRDRAM and FB-DIMM, with the capability to easily vary their parameters. It also models the power consumption of SDRAM and its derivatives. It can be used as a standalone simulator or as part of a more comprehensive system-level model. We have successfully integrated DRAMsim into a variety of simulators including MASE[15], Sim-alpha[14], BOCHS[2] and GEMS[13]. The simulator can be downloaded from www.ece.umd.edu/dramsim.