A Parallel Sorting Algorithm With an Experimental Study

View/ Open
Date
1998-10-15Author
Helman, David R.
Bader, David A.
JaJa, Joseph
Metadata
Show full item recordAbstract
Previous schemes for sorting on general-purpose parallel machines
have had to choose between poor load balancing and irregular
communication or multiple rounds of all-to-all personalized
communication. In this paper, we introduce a novel variation on sample
sort which uses only two rounds of regular all-to-all personalized
communication in a scheme that yields very good load balancing with
virtually no overhead. This algorithm was implemented in Split-C and
run on a variety of platforms, including the Thinking Machines CM-5,
the IBM SP-2, and the Cray Research T3D. We ran our code using widely
different benchmarks to examine the dependence of our algorithm on the
input distribution. Our experimental results are consistent with the
theoretical analysis and illustrate the efficiency and scalability of
our algorithm across different platforms. In fact, it seems to
outperform all similar algorithms known to the authors on these
platforms, and its performance is invariant over the set of input
distributions unlike previous efficient algorithms. Our results also
compare favorably with those reported for the simpler ranking problem
posed by the NAS Integer Sorting (IS) Benchmark.
(Also cross-referenced as UMIACS-TR-95-102)