SYSTEMATIC EXPLORATION OF TRADE-OFFS BETWEEN APPLICATION THROUGHPUT AND HARDWARE RESOURCE REQUIREMENTS IN DSP SYSTEMS
Abstract
Dataflow has been used extensively as an efficient model of computation
for analyzing performance and resource requirements when implementing DSP
algorithms on various target architectures. Although
software synthesis techniques have been widely studied in recent years,
the literature offers few efficient synthesis techniques
for systematically mapping dataflow models
into efficient hardware implementations. In this thesis,
we explore three aspects
that contribute to the development of a
powerful dataflow-based hardware synthesis framework:
- Systematic generation of 1D/2D FFT implementations on field-programmable
gate arrays (FPGAs). The fast Fourier transform (FFT) is one of the most
widely used and important signal processing functions. However, FFT computation
generally becomes a major bottleneck for overall system performance due to its
high computational requirements. We propose a systematic approach for
synthesizing FPGA implementations of one- and two-dimensional (1D and 2D) FFT
computations, and rigorously exploring trade-offs between cost (in terms of
FPGA resource requirements) and performance (in terms of throughput). Our
approach provides an efficient hardware synthesis framework that can be
customized to specific design constraints. In our FFT synthesis approach, we
apply two orthogonal techniques in FPGA implementation to realize
data parallelism and parallel processing in FFT computation, respectively.
These techniques can be applied to various 1D FFT algorithms, including radix-2
and radix-4 algorithms, and extended naturally and efficiently to 2D FFT
implementation.
- Buffer optimization under self-timed execution. Self-timed execution is
known to provide the maximum achievable throughput when mapping DSP dataflow
graphs into hardware under certain technical constraints.
Throughput-constrained buffer minimization under self-timed execution
is a key question in efficient hardware synthesis for practical design
scenarios. Previous approaches to this problem have suffered from high
worst-case complexity or loose buffer bounds, which lead to inefficient resource
utilization. In this thesis, we integrate a novel constraint into traditional
self-timed execution to obtain a modified form of self-timed execution, which
we call MSTE (Modified Self-Timed Execution). We show that MSTE greatly
improves the efficiency with which we can accurately analyze and optimize
hardware configurations of dataflow graphs, and furthermore, the additional
execution constraints imposed in MSTE result in relatively minor
performance overhead. Based on MSTE, we explore novel methods for self-timed
analysis and associated techniques for buffer optimization subject to given
throughput constraints.
- Hardware synthesis techniques for parameterized dataflow models.
Parameterized dataflow modeling approaches allow for dynamic capabilities
without excessively compromising the key properties of the existing static
dataflow model: compile-time predictability and potential for rigorous
optimizations. We develop a novel FPGA architecture framework based on
parameterized synchronous dataflow (PSDF) using National Instruments' LabVIEW FPGA, a recently introduced
commercial platform for reconfigurable hardware implementation.
This framework establishes novel connections among model-based
DSP system design, FPGA implementation, and next-generation wireless
communication systems.
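
As a software-only illustration of the first item above, the following minimal sketch shows a radix-2 FFT and one common way of extending a 1D FFT to 2D. It is not the FPGA synthesis framework described in the thesis. The function names (fft_radix2, dft_naive, fft2d) are hypothetical, and the row-column decomposition used for the 2D case is an assumption made for illustration; the abstract does not state which 2D scheme the thesis uses.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT (illustrative sketch).

    The input length must be a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a nonzero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])  # FFT of even-indexed samples
    odd = fft_radix2(x[1::2])   # FFT of odd-indexed samples
    half = n // 2
    out = [0j] * n
    # The n/2 butterflies of this stage do not depend on each other,
    # so a hardware design could evaluate several of them at once.
    for k in range(half):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def dft_naive(x):
    """Direct O(n^2) DFT, used only as a reference for checking."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


def fft2d(rows):
    """2D FFT by row-column decomposition (assumed scheme).

    Applies the 1D FFT to every row, then to every column of the result.
    """
    row_pass = [fft_radix2(r) for r in rows]
    col_pass = [fft_radix2(list(c)) for c in zip(*row_pass)]
    # col_pass holds transformed columns; transpose back to row order.
    return [list(r) for r in zip(*col_pass)]
```

Because the butterflies within a stage are independent, the number evaluated concurrently is one knob that trades hardware resources against throughput, which is the kind of trade-off the thesis explores on FPGAs.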