Automated Techniques for Designing Embedded Signal Processors on Distributed Platforms

dc.contributor.author: Kang, Dong-In [en_US]
dc.contributor.author: Gerber, Richard [en_US]
dc.contributor.author: Golubchik, Leana [en_US]
dc.date.accessioned: 2004-05-31T21:07:59Z
dc.date.available: 2004-05-31T21:07:59Z
dc.date.created: 1998-10 [en_US]
dc.date.issued: 1998-11-04 [en_US]
dc.description.abstract: In this paper, we present a performance-based technique to help synthesize high-bandwidth radar processors on commodity platforms. This problem is inherently complex for a number of reasons. Contemporary radars are very compute-intensive: they have high pulse rates, and they sample a large number of range readings at each pulse. Indeed, modern radar processors can require CPU loads in the high-gigaflop to teraflop range, performance that can only be achieved by exploiting the radar's inherent data parallelism. Next-generation radars are slated to operate on scalable clusters of commodity systems. Throughput is only one problem. Since radars are usually embedded within larger real-time applications, they must also adhere to latency (or deadline) constraints. Building an embedded radar processor on a network of workstations (or a NOW) involves partitioning load in a balanced fashion, accounting for the stochastic effects present in all software-based systems, synthesizing runtime parameters for the on-line schedulers and drivers, and meeting the latency and throughput constraints. In this paper, we show how performance analysis can be used as an effective tool in the design loop; specifically, our method uses analytic approximation techniques to help synthesize efficient designs for radar processing systems. In our method, the signal processor's topology is represented via a simple flow-graph abstraction, and the per-unit load requirements are modeled stochastically, to account for second-order effects like cache memory behavior, DMA interference, pipeline stalls, etc. Our design algorithm accepts the following inputs: (a) the system topology, including the thread-to-CPU mapping, where multi-threading is assumed to be used; (b) the per-task load models; and (c) the required pulse rate and latency constraints.
As output, it produces the proportion of load to allocate to each task, set at manageable time resolutions for the local schedulers; an optimal service interval over which all load proportions should be guaranteed; an optimal sampling frequency; and some reconfiguration schemes to accommodate single-node failures. Internally, the design algorithms use analytic approximations to quickly estimate output rates and propagation delays for candidate solutions. Once the system is synthesized, its results are checked via a simulation model, which removes many of the analytic approximations. We show how our system synthesizes a real-time synthetic aperture radar under a variety of loading conditions. Also cross-referenced as UMIACS TR # 98-57 [en_US]
dc.format.extent: 480098 bytes
dc.format.mimetype: application/postscript
dc.identifier.uri: http://hdl.handle.net/1903/499
dc.language.iso: en_US
dc.relation.isAvailableAt: Digital Repository at the University of Maryland [en_US]
dc.relation.isAvailableAt: University of Maryland (College Park, Md.) [en_US]
dc.relation.isAvailableAt: Tech Reports in Computer Science and Engineering [en_US]
dc.relation.isAvailableAt: Computer Science Department Technical Reports [en_US]
dc.relation.ispartofseries: UM Computer Science Department; CS-TR-3944 [en_US]
dc.title: Automated Techniques for Designing Embedded Signal Processors on Distributed Platforms [en_US]
dc.type: Technical Report [en_US]

Files

Original bundle
Name: CS-TR-3944.ps
Size: 468.85 KB
Format: Postscript Files

Name: CS-TR-3944.pdf
Size: 390.27 KB
Format: Adobe Portable Document Format
Description: Auto-generated copy of CS-TR-3944.ps