Reducing the Soft Error Rates of a High-Performance Microprocessor Using Front-End Throttling

Kalappurakkal, Smitha Menon
Franklin, Manoj
Microprocessors are used in an increasingly wide variety of applications, from small handheld calculators to multi-million-dollar servers. As our dependence on microprocessor-based systems grows, greater importance must be given not only to a processor's performance but also to its dependability. With each new technology generation, we witness a consistent increase in cosmically induced soft errors per chip. Earlier, only memory structures were significantly affected by soft errors, but today most memories are well protected by error detection and correction codes. Unprotected logic state elements, which were not a great concern in older technology generations, are becoming an increasing concern as technology scales. Although the fault rate per transistor has remained roughly the same across generations, the growing number of transistors per chip is producing a steady increase in raw error rates. Thus, the increased functionality and performance dictated by Moore's Law come at the cost of an exponentially increasing soft error rate. In our work, we present techniques to reduce a processor's soft error rate. We focus on one of the major contributors to the on-chip soft error rate, the Instruction Issue Queue (IQ), which has a significantly higher vulnerability factor (32.7% as measured in our work) than other microarchitectural structures such as the register file (18.65%), the re-order buffer (28%) and the execution units (9%). Modern processors often fetch and decode instructions aggressively in order to exploit as much parallelism as possible. However, this often results in instructions being fetched much earlier than necessary, so that valid instructions reside in the vulnerable IQ for many needless cycles, waiting either for their dependencies to be resolved or to be squashed on a mis-speculation event.
The additional ILP that this aggressive front-end design may expose often does not yield any major performance benefit. We exploit this inefficiency by slowing down the front-end of the pipeline when doing so is not likely to affect performance significantly. To this end, we explore a set of reliability-aware front-end throttling schemes that bring down the utilization of the IQ, and hence the soft error rate.
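
The intuition behind front-end throttling can be illustrated with a minimal sketch. The toy model below is not one of the schemes explored in this work: the function name `simulate_front_end`, the occupancy-threshold policy, and all parameter values are assumptions chosen for illustration, and the model ignores dependencies and mis-speculation entirely. It shows only that, when fetch bandwidth exceeds issue bandwidth, pausing fetch above an occupancy threshold can lower average IQ occupancy.

```python
def simulate_front_end(num_instructions, fetch_width, issue_width,
                       iq_size, throttle_threshold=None):
    """Toy cycle-level model of an issue queue (IQ).

    Hypothetical illustration only: every instruction is assumed ready to
    issue as soon as it is in the IQ. Returns (cycles, average occupancy).
    """
    iq = 0                      # instructions currently in the IQ
    remaining = num_instructions
    cycles = 0
    occupancy_sum = 0

    while remaining > 0 or iq > 0:
        # Back end: issue up to issue_width instructions from the IQ.
        iq -= min(iq, issue_width)

        # Front end: fetch unless the (assumed) throttle policy says to stall.
        throttled = throttle_threshold is not None and iq >= throttle_threshold
        if not throttled:
            fetched = min(fetch_width, iq_size - iq, remaining)
            iq += fetched
            remaining -= fetched

        occupancy_sum += iq     # occupancy at the end of this cycle
        cycles += 1

    return cycles, occupancy_sum / cycles


if __name__ == "__main__":
    base = simulate_front_end(1000, fetch_width=4, issue_width=2, iq_size=32)
    throttled = simulate_front_end(1000, fetch_width=4, issue_width=2,
                                   iq_size=32, throttle_threshold=8)
    print("unthrottled (cycles, avg occupancy):", base)
    print("throttled   (cycles, avg occupancy):", throttled)
```

In this toy setting the back end stays saturated in both runs, so the throttled configuration finishes in the same number of cycles while holding far fewer instructions in the queue on average. A real processor would need a policy that also accounts for dependencies and branch mis-speculation.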