The Dwarf Data Cube Eliminates the High Dimensionality Curse

dc.contributor.author: Sismanis, Yannis [en_US]
dc.contributor.author: Roussopoulos, Nick [en_US]
dc.date.accessioned: 2004-05-31T23:35:07Z
dc.date.available: 2004-05-31T23:35:07Z
dc.date.created: 2003-12 [en_US]
dc.date.issued: 2003-12-18 [en_US]
dc.description.abstract: The data cube operator encapsulates all possible groupings of a data set and has proved to be an invaluable tool in analyzing vast amounts of data. However, its apparent exponential complexity has significantly limited its applicability to low-dimensional data sets. Recently the dwarf data cube model was introduced, and it was shown that high-dimensional "dwarf data cubes" are orders of magnitude smaller than the original data cubes even when they calculate and store every possible aggregation with 100% precision. In this paper we present a surprising analytical result proving that the size of dwarf cubes grows polynomially with the dimensionality of the data set and that, therefore, a full data cube at 100% precision is not inherently cursed by high dimensionality. This striking result of polynomial complexity reformulates the context of cube management and redefines most of the problems associated with data warehousing and On-Line Analytical Processing. We also develop an efficient algorithm for estimating the size of dwarf data cubes before actually computing them. Finally, we complement our analytical approach with an experimental evaluation using real and synthetic data sets, and demonstrate our results. (UMIACS-TR-2003-120) [en_US]
dc.format.extent: 252169 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/1903/1333
dc.language.iso: en_US
dc.relation.isAvailableAt: Digital Repository at the University of Maryland [en_US]
dc.relation.isAvailableAt: University of Maryland (College Park, Md.) [en_US]
dc.relation.isAvailableAt: Tech Reports in Computer Science and Engineering [en_US]
dc.relation.isAvailableAt: UMIACS Technical Reports [en_US]
dc.relation.ispartofseries: UM Computer Science Department; CS-TR-4552 [en_US]
dc.relation.ispartofseries: UMIACS; UMIACS-TR-2003-120 [en_US]
dc.title: The Dwarf Data Cube Eliminates the High Dimensionality Curse [en_US]
dc.type: Technical Report [en_US]

Files

Original bundle
Name: CS-TR-4552.pdf
Size: 246.26 KB
Format: Adobe Portable Document Format