Empirical Speedup Study of Truly Parallel Data Compression
Date
2013-04-20
Authors
Edwards, James A.
Vishkin, Uzi
Abstract
We present an empirical study of novel work-optimal parallel algorithms for Burrows-Wheeler compression and decompression of strings over a constant alphabet. To validate these theoretical algorithms, we implement them on the experimental XMT computing platform, developed at the University of Maryland specifically to support parallel algorithms.
We show speedups of up to 25x for compression and 13x for decompression versus bzip2, the de facto standard implementation of Burrows-Wheeler compression. Unlike existing approaches, which assign an entire block (e.g., 900KB) to a processor that processes the block serially, our approach is “truly parallel”: it processes the entire input in parallel (the underlying transform is sketched below). Beyond the theoretical interest in solving the “right” problem, the practical importance of compression speed for small inputs, even at great expense of quality (compressed data size), is demonstrated by the introduction of Google’s Snappy for MapReduce. Perhaps surprisingly, we show that it is feasible to hold on to quality while even beating Snappy on speed.
In turn, this work adds new evidence in support of the XMT/PRAM thesis: that an XMT-like many-core hardware/software platform may be necessary for enabling general-purpose parallel computing. Comparison of our results to recently published work suggests a 70x improvement over what current commercial parallel hardware can achieve.
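
For readers unfamiliar with the transform at the heart of bzip2, below is a minimal serial Burrows-Wheeler transform sketch in C++ (illustrative only, not the paper’s parallel algorithm): it sorts all rotations of the input and emits the last column. Block-parallel tools such as pbzip2 run a transform like this serially within each roughly 900KB block; the approach described in the abstract instead parallelizes over the entire input.

```cpp
#include <algorithm>
#include <iostream>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Sort all rotations of s and return (last column, index of the original
// string among the sorted rotations). The last column plus that index is
// enough to invert the transform.
std::pair<std::string, size_t> bwt(const std::string& s) {
    const size_t n = s.size();
    std::vector<size_t> rot(n);
    std::iota(rot.begin(), rot.end(), 0);  // each rotation named by its start offset
    std::sort(rot.begin(), rot.end(), [&](size_t a, size_t b) {
        // Compare the rotations starting at offsets a and b lexicographically.
        for (size_t k = 0; k < n; ++k) {
            const char ca = s[(a + k) % n], cb = s[(b + k) % n];
            if (ca != cb) return ca < cb;
        }
        return false;  // identical rotations
    });
    std::string last(n, '\0');
    size_t primary = 0;
    for (size_t i = 0; i < n; ++i) {
        if (rot[i] == 0) primary = i;       // row holding the original string
        last[i] = s[(rot[i] + n - 1) % n];  // last character of this rotation
    }
    return {last, primary};
}

int main() {
    const auto [last, idx] = bwt("banana");
    std::cout << last << ' ' << idx << '\n';  // prints: nnbaaa 3
}
```

Production implementations build a suffix array instead of sorting explicit rotations, avoiding the worst-case quadratic comparisons of the rotation sort above.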