Empirical Speedup Study of Truly Parallel Data Compression

dc.contributor.authorEdwards, James A.
dc.contributor.authorVishkin, Uzi
dc.date.accessioned2013-05-03T12:19:32Z
dc.date.available2013-05-03T12:19:32Z
dc.date.issued2013-04-20
dc.description.abstractWe present an empirical study of novel work-optimal parallel algorithms for Burrows-Wheeler compression and decompression of strings over a constant alphabet. To validate these theoretical algorithms, we implement them on the experimental XMT computing platform developed especially for supporting parallel algorithms at the University of Maryland. We show speedups of up to 25x for compression, and 13x for decompression, versus bzip2, the de facto standard implementation of Burrows-Wheeler compression. Unlike existing approaches, which assign an entire (e.g., 900KB) block to a processor that processes the block serially, our approach is “truly parallel” as it processes in parallel the entire input. Besides the theoretical interest in solving the “right” problem, the importance of data compression speed for small inputs even at great expense of quality (compressed size of data) is demonstrated by the introduction of Google’s Snappy for MapReduce. Perhaps surprisingly, we show feasibility of holding on to quality, while even beating Snappy on speed. In turn, this work adds new evidence in support of the XMT/PRAM thesis: that an XMT-like many-core hardware/ software platform may be necessary for enabling general-purpose parallel computing. Comparison of our results to recently published work suggests 70x improvement over what current commercial parallel hardware can achieve.en_US
dc.description.sponsorshipNSF grants CCF-0811504 and CNS1161857en_US
dc.identifier.urihttp://hdl.handle.net/1903/13890
dc.language.isoen_USen_US
dc.relation.isAvailableAtCollege of Computer, Mathematical & Natural Sciencesen_us
dc.relation.isAvailableAtComputer Scienceen_us
dc.relation.isAvailableAtDigital Repository at the University of Marylanden_us
dc.relation.isAvailableAtUniversity of Maryland (College Park, MD)en_us
dc.subjectparallel computingen_US
dc.subjectPRAMen_US
dc.subjectSpeedupsen_US
dc.subjectBurrows-Wheeleren_US
dc.subjectlossless compressionen_US
dc.titleEmpirical Speedup Study of Truly Parallel Data Compressionen_US
dc.typeTechnical Reporten_US

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
bw_compression_empirical_TR_2May2013.pdf
Size:
314.97 KB
Format:
Adobe Portable Document Format