Automated Techniques for Designing Embedded Signal Processors on
Distributed Platforms
Date
1998-11-04
Authors
Kang, Dong-In
Gerber, Richard
Golubchik, Leana
Abstract
In this paper, we present a performance-based technique to
help synthesize high-bandwidth radar processors on commodity
platforms. This problem is innately complex, for a number of reasons.
Contemporary radars are very compute-intensive: they have
high pulse rates, and they sample a large number of range readings
at each pulse. Indeed, modern radar processors can require
CPU loads in the high-gigaflop to teraflop range,
performance which is only achieved by exploiting the radar's inherent
data parallelism. Next-generation radars
are slated to operate on scalable clusters
of commodity systems.
Throughput is only one problem. Since
radars are usually embedded within larger real-time applications,
they also must adhere to latency (or deadline) constraints.
Building an embedded radar processor on a network
of workstations (or a NOW) involves partitioning load in a
balanced fashion, accounting for the stochastic
effects inherent in all software-based systems, synthesizing
runtime parameters for the on-line schedulers and drivers,
and meeting the latency and throughput constraints.
In this paper, we show how performance analysis can be used as
an effective tool in the design loop; specifically, our
method uses analytic approximation techniques to help
synthesize efficient designs for radar processing systems.
In our method, the signal processor's topology is represented via a simple
flow-graph abstraction, and the per-unit load requirements are modeled
stochastically, to account for second-order effects like cache memory
behavior, DMA interference, pipeline stalls, etc. Our design algorithm
accepts the following inputs: (a) the system topology, including
the thread-to-CPU mapping, where multi-threading is assumed to be used;
(b) the per-task load models; and (c) the required pulse rate and latency
constraints. As output, it produces the proportion of load to allocate
to each task, set at manageable time resolutions for the local
schedulers; an optimal service interval over which all
load proportions should be guaranteed; an optimal sampling
frequency; and some reconfiguration schemes to accommodate
single-node failures. Internally, the design algorithms use analytic
approximations to quickly estimate output rates and propagation delays for
candidate solutions. When the system is synthesized, its results are
checked via a simulation model, which removes many of the analytic
approximations.
We show how our system synthesizes a real-time synthetic aperture
radar, under a variety of loading conditions.
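The inputs and checks described above can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a purely linear pipeline, summarizes each task's stochastic load by a mean and variance, and uses a crude "mean plus k standard deviations" approximation for per-pulse service time. The stage names, numeric values, and the helpers `Stage`, `stage_time`, and `check_design` are all hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Stage:
    """One task in a linear flow graph (all values are illustrative)."""
    name: str
    mean_demand: float  # mean CPU seconds needed per pulse
    var_demand: float   # variance of CPU seconds per pulse
    share: float        # proportion of a CPU allocated to this task


def stage_time(stage, k=2.0):
    """Approximate per-pulse service time.

    Assumption: demand is padded by k standard deviations and then
    stretched by the CPU share the task receives.
    """
    padded = stage.mean_demand + k * sqrt(stage.var_demand)
    return padded / stage.share


def check_design(stages, pulse_period, deadline, k=2.0):
    """Return (throughput_ok, estimated_latency, latency_ok)."""
    times = [stage_time(s, k) for s in stages]
    # Throughput: every stage must finish a pulse within one pulse period.
    throughput_ok = all(t <= pulse_period for t in times)
    # Latency: a pulse traverses the stages one after another.
    latency = sum(times)
    return throughput_ok, latency, latency <= deadline


# Hypothetical two-stage pipeline with a 5 ms pulse period and 20 ms deadline.
pipeline = [
    Stage("range_compress", mean_demand=0.002, var_demand=1e-8, share=0.5),
    Stage("azimuth_filter", mean_demand=0.003, var_demand=4e-8, share=0.8),
]
ok, latency, meets = check_design(pipeline, pulse_period=0.005, deadline=0.02)
```

A design loop in this spirit would adjust the `share` values until both checks pass, and then confirm the candidate with a more detailed simulation, as the abstract describes.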
Also cross-referenced as UMIACS TR # 98-57