SYSTEMATIC EXPLORATION OF TRADE-OFFS BETWEEN APPLICATION THROUGHPUT AND HARDWARE RESOURCE REQUIREMENTS IN DSP SYSTEMS

Date
2010
Authors
Kee, Hojin
Advisor
Bhattacharyya, Shuvra S.
Abstract
Dataflow has been used extensively as an efficient model of computation to analyze performance and resource requirements in implementing DSP algorithms on various target architectures. Although various software synthesis techniques have been widely studied in recent years, there is a distinct lack of efficient synthesis techniques in the literature for systematically mapping dataflow models into efficient hardware implementations. In this thesis, we explore three different aspects that contribute to the development of a powerful dataflow-based hardware synthesis framework:

1. Systematic generation of 1D/2D FFT implementations on field-programmable gate arrays (FPGAs). The fast Fourier transform (FFT) is one of the most widely used and important signal processing functions. However, FFT computation generally becomes a major bottleneck for overall system performance due to its high computational requirements. We propose a systematic approach for synthesizing FPGA implementations of one- and two-dimensional (1D and 2D) FFT computations, and for rigorously exploring trade-offs between cost (in terms of FPGA resource requirements) and performance (in terms of throughput). Our approach provides an efficient hardware synthesis framework that can be customized to specific design constraints. In our FFT synthesis approach, we apply two orthogonal techniques in FPGA implementation to realize data parallelism and parallel processing in FFT computation, respectively. These techniques can be applied to various 1D FFT algorithms, including Radix-2 and Radix-4 algorithms, and extended naturally and efficiently to 2D FFT implementation.

2. Buffer optimization under self-timed execution. Self-timed execution is known to provide the maximum achievable throughput when mapping DSP dataflow graphs into hardware under certain technical constraints. Throughput-constrained buffer minimization under self-timed execution is a key question in efficient hardware synthesis for practical design scenarios. Previous approaches to this problem have suffered from high worst-case complexity or loose buffer bounds, which lead to inefficient resource utilization. In this thesis, we integrate a novel constraint into traditional self-timed execution to obtain a modified form of self-timed execution, which we call MSTE (Modified Self-Timed Execution). We show that MSTE greatly improves the efficiency with which we can accurately analyze and optimize hardware configurations of dataflow graphs, and furthermore, that the additional execution constraints imposed in MSTE result in relatively minor performance overhead. Based on MSTE, we explore novel methods for self-timed analysis and associated techniques for buffer optimization subject to given throughput constraints.

3. Hardware synthesis technique for the parameterized dataflow model. Parameterized dataflow modeling approaches allow for dynamic capabilities without excessively compromising the key properties of the existing static dataflow model: compile-time predictability and potential for rigorous optimizations. We develop a novel FPGA architecture framework based on parameterized synchronous dataflow (PSDF) using National Instruments' LabVIEW FPGA, a recently introduced commercial platform for reconfigurable hardware implementation. This framework develops novel connections among model-based DSP system design, FPGA implementation, and next-generation wireless communication systems.
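
To make the buffer optimization problem described above more concrete, the following Python sketch simulates self-timed execution of a toy two-actor dataflow graph with a single bounded buffer, and then searches by brute force for the smallest buffer that meets a throughput target. It is only an illustration under stated assumptions and is not the MSTE-based method developed in the thesis: the graph, the actor execution times, the token rates, and the names `simulate` and `min_capacity` are all hypothetical, and the exhaustive search stands in for the far more efficient analysis the thesis proposes.

```python
def simulate(capacity, horizon=3000, t_a=2, t_b=3, produce=2, consume=3):
    """Return how many firings of consumer B complete within `horizon` time units.

    Toy graph (an assumption for illustration): actor A --> buffer --> actor B.
    A takes t_a time units and writes `produce` tokens when it finishes.
    B takes t_b time units and removes `consume` tokens when it starts.
    Self-timed rule: an actor fires as soon as it is idle, its input tokens
    are present, and the buffer has room for its output.
    """
    tokens = 0        # tokens currently stored in the buffer
    reserved = 0      # buffer space claimed by an in-flight firing of A
    a_done = None     # completion time of A's in-flight firing, if any
    b_done = None     # completion time of B's in-flight firing, if any
    b_firings = 0

    for now in range(horizon + 1):
        # Complete any firing that finishes at this time step.
        if a_done == now:
            tokens += produce
            reserved -= produce
            a_done = None
        if b_done == now:
            b_firings += 1
            b_done = None

        # Start B first so the space it frees is visible to A immediately.
        if b_done is None and tokens >= consume:
            tokens -= consume
            b_done = now + t_b
        # A may start only if its output is guaranteed to fit in the buffer.
        if a_done is None and tokens + reserved + produce <= capacity:
            reserved += produce
            a_done = now + t_a

    return b_firings


def min_capacity(target_firings, max_capacity=64, **kwargs):
    """Smallest buffer capacity reaching `target_firings`, or None if none does.

    Exhaustive search, used here only because the example graph is tiny.
    """
    for capacity in range(1, max_capacity + 1):
        if simulate(capacity, **kwargs) >= target_firings:
            return capacity
    return None


if __name__ == "__main__":
    # Print the throughput/buffer-size trade-off for a few capacities.
    for capacity in range(3, 8):
        print(capacity, simulate(capacity))
```

With the default parameters, a capacity of 3 tokens deadlocks the graph, 4 tokens sustains a reduced firing rate, and 5 tokens already matches the rate obtained with a much larger buffer. This is the kind of trade-off between throughput and hardware resources that a throughput-constrained buffer minimization must resolve.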