Harnessing Checker Hierarchy for Reliable Microprocessors

dc.contributor.advisorFranklin, Manojen_US
dc.contributor.authorYoo, Joonhyuken_US
dc.contributor.departmentElectrical Engineeringen_US
dc.contributor.publisherDigital Repository at the University of Marylanden_US
dc.contributor.publisherUniversity of Maryland (College Park, Md.)en_US
dc.description.abstractTraditional fault-tolerant multi-threading architectures provide good fault tolerance by re-executing all the computations. However, such a full re-execution significantly increases the demand on the processor resources, resulting in severe performance degradation. To address this problem, this dissertation presents Active Verification Management (AVM) approaches that utilize a checker hierarchy to increase its performance with a minimal effect on the overall reliability. Based on a simplified queueing model, AVM employs a filter checker which prioritizes the verification candidates to selectively do verification. This dissertation proposes three filter checkers - based on (1) result usage, (2) result bitwidth, and (3) result anomaly - that exploit correctness-criticality metrics and anomaly speculation. Binary Correctness Criticality (BCC) and Likelihood of Correctness Criticality (LoCC) are metrics that quantify whether an instruction is important for reliability or how likely an instruction is correctness-critical, respectively. Based on the BCC, a result-usage-based filter checker mitigates the verification workload by bypassing instructions that are unnecessary for correct execution. Computing the LoCC is accomplished by exploiting information redundancy of compressing computationally useful data bits. Numerical significance hints let the result-bitwidth-based filter checker guide a verification priority effectively before the re-execution process starts. A result-anomaly-based filter checker exploits a value similarity property, which is defined by a frequent occurrence of partially identical values. Based on the biased distribution of similarity distance measure, this dissertation further investigates another application to exploit similar values for soft error tolerance with anomaly speculation. Extensive measurements show that the majority of instructions produce values that are different from the previous result value only in a few bits. Experimental results show that the proposed schemes accelerate the processor to be 180% faster than traditional fully-fault-tolerant processor, with a minimal impact on the overall soft error rate. With no AVM, congestion at the checker badly affects performance, to the tune of 57%, when compared to that of a non-fault-tolerant processor. These results explain that the proposed AVM has the potential to solve the verification congestion problem when perfect fault coverage is not needed.en_US
dc.format.extent1231091 bytes
dc.subject.pqcontrolledEngineering, Electronics and Electricalen_US
dc.subject.pqcontrolledComputer Scienceen_US
dc.subject.pqcontrolledEngineering, Electronics and Electricalen_US
dc.subject.pquncontrolledfault-tolerant computer architecture;active verification managementen_US
dc.subject.pquncontrolledfilter checkeren_US
dc.subject.pquncontrolledcorrectness criticalityen_US
dc.subject.pquncontrolledanomaly speculationen_US
dc.titleHarnessing Checker Hierarchy for Reliable Microprocessorsen_US
Original bundle
Now showing 1 - 1 of 1
Thumbnail Image
1.17 MB
Adobe Portable Document Format