TERPS: The Embedded Reliable Processing System
Gole, Amol Vishwas
Jacob, Bruce L
MetadataShow full item record
Electromagnetic Interference (EMI) can have an adverse effect on commercial electronics. As feature sizes of integrated circuits become smaller, their susceptibility to EMI increases. In light of this, integrated circuits will face substantial problems in the future either from electromagnetic disturbances or intentionally generated EMI from a malicious source. The Embedded Reliable Processing System (TERPS) is a fault tolerant system architecture which can significantly reduce the threat of EMI in computer systems. TERPS employs a checkpoint and rollback recovery mechanism tied with a multi-phase commit protocol and 3D IC technology. This enables it to recover from substantial EMI without having to shutdown or reboot. In the face of such EMI, only a loss in performance dictated by the strength and duration of the interference and the frequency of checkpointing will be seen. Various conditions in which chips can fail under the influence of EMI are described. The checkpoint and rollback recovery mechanism and the resulting TERPS architecture is stipulated. A thorough evaluation of the design correctness is provided. The technique is implemented in Verilog HDL using a 16-bit, 5-stage pipelined processor to show proof of concept. The performance overhead is calculated for different checkpointing intervals and is shown to be very reasonable (5-6% for checkpointing every 128 CPU cycles).