Institute for Systems Research Technical Reports

Permanent URI for this collectionhttp://hdl.handle.net/1903/4376

This archive contains a collection of reports generated by the faculty and students of the Institute for Systems Research (ISR), a permanent, interdisciplinary research unit in the A. James Clark School of Engineering at the University of Maryland. ISR-based projects are conducted through partnerships with industry and government, bringing together faculty and students from multiple academic departments and colleges across the university.

Browse

Search Results

Now showing 1 - 9 of 9
  • Thumbnail Image
    Item
    VLSI Algorithms and Architectures for Complex Householder Transformation with Applications to Array Processing
    (1991) Tang, C.F.T.; Liu, K.J. Ray; Hsieh, S.F.; Yao, K.; ISR
    The Householder transformation is considered to be desirable among various unitary transformations due to its superior computational efficiency and robust numerical stability. Specifically, the Householder transformation outperforms the Givens rotation and the modified Gram-Schmidt methods in numerical stability under finite-precision implementations, as well as requiring fewer arithmetical operations. Consequently, the QR decomposition based on the Householder transformation is promising for VLSI implementation and real-time high throughput modern signal processing. In this paper, a recursive complex Householder transformation with a fast initialization algorithm is proposed and its associated parallel/pipelined architecture is also considered. Then, a complex Householder transformation based recursive least-squares algorithm with a fast initialization is presented. Its associated systolic array processing architecture is also considered.
  • Thumbnail Image
    Item
    A Unified Approach for QRD-Based Recursive Least-Squres Estimation without Square Roots
    (1991) Hsieh, S.F.; Liu, K.J. Ray; Yao, K.; ISR
    The QR-decomposition (QRD)-based recursive least-squares (RLS) methods have been shown to be useful and effective towards adaptive signal processing in modern communications, radar, and sonar systems implementable with various modern parallel and systolic array architectures. The planar (Givens) and hyperbolic rotations are the most commonly used methods in performing the QRD up/downdating. But the generic formula for these rotations require explicit square-root (sqrt) computations, which constitute the computational bottleneck and are quite undesirable from the practical VLSI circuit design point of view. There has been more than ten sqrt-free algorithms known so far. In this paper, we provide a unified systematic approach for the sqrt-free QRD-based RLS estimation problem. By properly choosing two parameters, and v, all existing known sqrt-free methods fall in the category of our unified approach. The proposed method not only can generalize all currently known sqrt-free QRD algorithms, but also new sqrt-free algorithms as long as the parameters and v are properly chosen. The unified treatment is also extended to the QRD-based RLS problems for optimal residual acquisition without sqrt operations, and the systolic array implementation.
  • Thumbnail Image
    Item
    Estimation of Multiple Sinusoidal Frequencies Using Truncated Least-Squares Methods
    (1991) Hsieh, S.F.; Liu, K.J. Ray; Yao, K.; ISR
    Tufts and Kumaresan (1982) first proposed using a SVD-based method to solve the forward-backward linear prediction (FBLP) least-squares problem for resolving closely spaced frequencies of multiple sinusoids from limited amount of data samples. By imposing an excessive order in the FBLP model and then truncating small singular values to zero, this truncated SVD (TSVD) method yields a low SNR threshold and greatly suppresses spurious frequencies. However, the massive computation required by SVD makes it unsuitable for real time super-resolution applications. We propose to use truncated QR methods which are amenable to VLSI implementations, such as systolic arrays, with slightly degraded performances as compared to the TSVD method. Three truncated QR methods for sinusoidal frequency estimation will be considered: (1) truncated QR without column pivoting (TQR); (2) truncated QR with re-ordered columns (TQRR); and (3) truncated QR with column pivoting (TQRP). It is demonstrated that the benefit of the TSVD method for high frequency resolution is achievable under the truncated QR methods with much lower computational cost. Other attractive features of the proposed methods include the ease of updating which is difficult for the SVD method, and numerical stability. Thus, the TQR methods offer efficient ways for identifying sinusoids closedly clustered in frequencies under stationary and nonstationary conditions. Some results based on the truncated normal equation approach as well as on sufficient conditions for perfect truncations based on truncated QR and SVD methods are considered. Based on the FBLP model, computer simulations and comparisons are provided for different truncation methods under various SNR's. Comparisons of asymptotic performance with large data samples are also given.
  • Thumbnail Image
    Item
    Dual-State Systolic Architectures for Adaptive Filtering Using Up/Downdating RLS
    (1991) Hsieh, S.F.; Liu, K.J. Ray; Yao, K.; ISR
    We propose a dual-state systolic structure to perform joint up/down-dating operations encountered in windowed recursive least squares (RLS) estimation problems. It is derived by successively performing Givens rotations for updating and hyperbolic rotations for down-dating. Due to the data independency, a series of Givens and hyperbolic rotations can be interleaved and parallel processing can be achieved by alternatively performing updating and downdating both in time and space. This flip-flop nature of up/down-dating characterizes the feature of dual-state systolic triarray. To further reduce the complexity and increase the throughput rate, Cordic cells can be used to mimic the operations of rowbroadcasting and only one control bit is required along each row of processors. Efficient implementation to obtain optimal residuals and a transformation of the hyperbolic rotation to an algebraically equivalent orthogonal operation to provide a more stable implementation are also considered. This systolic architecture is very promising in VLSI implementation of the sliding-window recursive least squares estimations.
  • Thumbnail Image
    Item
    Systolic Implementations of Up/Down-dating Cholesky Factorization Using Vectorized Gram-Schmidt Pseudo Orthogonalization
    (1991) Hsieh, S.F.; Liu, K.J. Ray; Yao, K.; ISR
    We propose a new class of hyperbolic Gram-Schmidt methods to simultaneously update and downdate the Cholesky factor of a sample covariance matrix efficiently with applications to sliding window recursive least squares (RLS) filtering problems. Several vectorized versions of this Gram-Schmidt approach are introduced, which include conventional column-updating, modified row/column- updating, and square-root-free methods. Comparisons to the existing known methods, such as Householder transformation and Givens rotation, are also given. Upon further reformulating these algorithms, a systolic triarray structure is proposed to facilitate VLSI implementations.
  • Thumbnail Image
    Item
    Multi-phase Systolic Algorithms for Spectral Decomposition
    (1991) Liu, K.J. Ray; Yao, K.; ISR
    In this paper, we propose two multi-phase systolic algorithms to solve the spectral decomposition problem based on the QR algorithm. The spectral decomposition is one of the most computationally intensive modern signal processing operations. While the QR algorithm is well known to be an effective method to solve the eigenvalue problem, there is still no single systolic array architecture that can compute the unitary Q matrix readily and perform the QR algorithm efficiently. Previous methods using the QR algorithm had communication problems among different architectures. In this paper, two arrays, a triangular and a rectangular, are presented to implement the multi-phase algorithms. Details on these multi-phase operations of the QR algorithm as well as architectural consequences and performance evaluation are discussed in the paper. Efficient fault-tolerant schemes for these multi-phase operations are also considered.
  • Thumbnail Image
    Item
    Real-Time Algorithm-Based Fault-Tolerance for QRD Recursive Least-Squares Systolic Array: A Graceful Degradation Approach
    (1991) Liu, K.J. Ray; Yao, K.; ISR
    In this paper, we propose a new algorithm-based fault-tolerant method derived from the inherent nature of the QR lease-squares systolic algorithm. Since the residuals of different desired responses can be computed simultaneously, an artificial desired response can be designed to detect an error produced by a faulty processor. We show that if the artificial desired response is designed as some proper combinations of the input data, the output residual of the system will be zero if there is no fault. However, any occurring fault in the system will cause the residual to be non-zero and the fault can be detected in realtime. Once the fault has been detected, the system enters into the fault diagnosis phase from the concurrent error detection phase. Two methods, the flushing fault location and the checksum encoding methods, can be used to diagnose the location of the faulty row. When the faulty row is determined, this row and the associated column with the same boundary cell are eliminated by a reconfiguration operation. Then the system degrades in a graceful manner which is generally acceptable for many least-squares applications. Those eliminated processors enter into a self-checking phase, and when the transient fault condition is removed, a reconfiguration is performed to resume the normal full order operation. The analysis of error propagation and recovery latency is also considered in this paper.
  • Thumbnail Image
    Item
    Dynamic Range, Stability, and Fault-tolerant Capability of Finite-precision RLS Systolic Array Based on Givens Rotations
    (1990) Liu, K.J. Ray; Hsieh, S.F.; Yao, K.; Chiu, Ching-Te; ISR
    The QRD RLS algorithm is generally recognized as having good numerical properties under finite-precision implementation. Also, it is very suitable for VLSI implementation since it can be easily mapped onto a systolic array. However, it is still unclear how to obtain the dynamic range of the algorithm such that a wordlength can be chosen to ensure correct operations of the algorithm. In this paper, we first propose a quasi-steady state model by observing the rotation parameters generated by boundary cells will eventually reach quasi steady-state regardless of the input data statistics if l is close to one. With this model, we can obtain upper bounds of the dynamic range of processing cells. Thus, the wordlength can be obtained from upper bounds of the dynamic range to prevent overflow and to ensure correct operations of the QRD RLS algorithm. Then we reconsider the stability problem under quantization effects with more general analysis and obtain tighter bounds than given in a previous work [13]. Finally, two fault-tolerant problems, the missing error detection and the false alarm effect, the arise under finite- precision implementation are considered. Detail analysis on preventing missing error detection with a false alarm free condition is presented.
  • Thumbnail Image
    Item
    Systolic Block Householder Transformation for RLS Algorithm with Two-level Pipelined Implementation
    (1990) Liu, K.J. Ray; Hsieh, S.F.; Yao, K.; ISR
    The QRD RLS algorithm is one of the most promising RLS algorithms, due to its robust numerical stability and suitability for VLSI implementation based on a systolic array architecture. Up to now, among many techniques to implement the QR decomposition, only the Given rotation and modified Gram-Schmidt methods have been successfully applied to the development of the QRD RLS systolic array. It is well-known that Householder transformation (HT) outperforms the Givens rotation method under finite precision computations. Presently, there is no know technique to implement the HT on a systolic array architecture. In this paper, we propose a Systolic Block Householder Transformation (SBHT) approach, to implement the HT on a systolic array as well as its application to the RLS algorithm. Since the data is fetched in a block manner, vector operations are in general required for the vectorized array. However, by using a modified HT algorithm, a two-level pipelined implementation can be used to pipeline the SBHT systolic array both at the vector and word level. The throughput rate can be as fast as that of the Givens rotation method. Our approach makes the HT amenable for VLSI implementation as well as applicable to real-time high throughput applications of modern signal processing. The constrained RLS problem using the SBHT RLS systolic array is also considered in this paper.