Parallelization of the SSCA#3 Benchmark on the RAW Processor

dc.contributor.author: Wu, Meng-Ju
dc.contributor.author: Yeung, Donald
dc.date.accessioned: 2006-11-06T19:04:17Z
dc.date.available: 2006-11-06T19:04:17Z
dc.date.issued: 2006-11-06
dc.description.abstract: The MIT Raw machine provides a point-to-point interconnection network for transferring register values between tiles. The programmer must schedule the network communication for each tile by hand and guarantee its correctness. Parallelizing benchmarks by hand for every possible tile configuration on the Raw processor is difficult. To overcome this problem, we develop a communication library and a switch code generator that creates the switch code for each tile automatically. We implement our techniques for the SSCA#3 (SAR Sensor Processing, Knowledge Formation) benchmark and evaluate the parallelism on a physical Raw processor. The experimental results show that the SSCA#3 benchmark has dense matrix operations with abundant parallelism. Using 16 tiles, the 'SAR image formation' procedure achieves a speedup of 13.86, and the 'object detection' procedure achieves a speedup of 9.98. [en]
dc.format.extent: 334777 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/1903/3987
dc.language.iso: en_US [en]
dc.relation.ispartofseries: UMIACS [en]
dc.relation.ispartofseries: UMIACS-TR-2006-42 [en]
dc.title: Parallelization of the SSCA#3 Benchmark on the RAW Processor [en]
dc.type: Technical Report [en]

Files

Original bundle

Name: UMIACS-TR-2006-42.pdf
Size: 326.93 KB
Format: Adobe Portable Document Format