We are in the process of updating the DRUM statistics and the number of downloads reported in DRUM records only reflects downloads from June 2014 to the present. The previous numbers have not been lost and we are in the process adding them to the total. Please contact firstname.lastname@example.org if you have any questions.
Buffer-On-Board Memory System
MetadataShow full item record
The design and implementation of the commodity memory architecture has resulted in significant limitations in a system's speed and capacity. To circumvent these limitations, designers and vendors have begun to place intermediate logic between the CPU and DRAM. This additional logic has two functions: to control the DRAM and to communicate with the CPU over a fast and narrow bus. The benefit provided by this logic is a reduction in pin-out to the memory system from the CPU and increased signal integrity seen by the DRAM, granting faster clock rates while increasing capacity. This new design is reminiscent of the FB-DIMM memory system yet makes key changes to its architecture including the use of existing DIMMs to reduce cost, a reduction in power (relative to FB-DIMM), and a more stable request latency. The problem is that the few vendors utilizing this design have the same general approach, yet the implementations vary greatly in their non-trivial details. A hardware verified simulation suite is developed to accurately model and evaluate the behavior of this buffer-on-board memory system. A study of this design space is performed to determine optimal use of the resources involved. This includes DRAM and bus organization, queue storage, and mapping schemes. Various constraints based on implementation costs are placed on simulated configurations to confirm that these optimizations apply to viable systems. Finally, full system simulations are performed with MARSSx86 to better understand how this memory system interacts with a CPU, cache, and operating system executing an application. Full system simulations uncover behaviors not present in simple limit-case simulations such as the impact of address and channel mapping schemes or the organization of ports and the associated buffers. When applying insights gleaned from these simulations, optimal performance can be achieved while still considering outside constraints (i.e., pin-out, power, and fabrication costs).