Real-Time Algorithm-Based Fault-Tolerance for QRD Recursive Least-Squares Systolic Array: A Graceful Degradation Approach

Loading...
Thumbnail Image

Files

TR_91-10.pdf (1.36 MB)
No. of downloads: 518

Publication or External Link

Date

1991

Advisor

Citation

DRUM DOI

Abstract

In this paper, we propose a new algorithm-based fault-tolerant method derived from the inherent nature of the QR lease-squares systolic algorithm. Since the residuals of different desired responses can be computed simultaneously, an artificial desired response can be designed to detect an error produced by a faulty processor. We show that if the artificial desired response is designed as some proper combinations of the input data, the output residual of the system will be zero if there is no fault. However, any occurring fault in the system will cause the residual to be non-zero and the fault can be detected in realtime. Once the fault has been detected, the system enters into the fault diagnosis phase from the concurrent error detection phase. Two methods, the flushing fault location and the checksum encoding methods, can be used to diagnose the location of the faulty row. When the faulty row is determined, this row and the associated column with the same boundary cell are eliminated by a reconfiguration operation. Then the system degrades in a graceful manner which is generally acceptable for many least-squares applications. Those eliminated processors enter into a self-checking phase, and when the transient fault condition is removed, a reconfiguration is performed to resume the normal full order operation. The analysis of error propagation and recovery latency is also considered in this paper.

Notes

Rights