T2: A Customizable Parallel Database For Multi-dimensional Data

dc.contributor.author: Chang, Chialin [en_US]
dc.contributor.author: Acharya, Anurag [en_US]
dc.contributor.author: Sussman, Alan [en_US]
dc.contributor.author: Saltz, Joel [en_US]
dc.date.accessioned: 2004-05-31T22:49:39Z
dc.date.available: 2004-05-31T22:49:39Z
dc.date.created: 1998-01 [en_US]
dc.date.issued: 1998-10-15 [en_US]
dc.description.abstract: As computational power and storage capacity increase, processing and analyzing large volumes of multi-dimensional datasets play an increasingly important part in many domains of scientific research. Several database research groups and vendors have developed object-relational database systems to provide some support for managing and/or visualizing multi-dimensional datasets. These systems, however, provide little or no support for analyzing or processing these datasets; the assumption is that this is too application-specific to warrant common support. As a result, applications that process these datasets are usually decoupled from data storage and management, resulting in inefficiency due to copying and loss of locality. Furthermore, every application developer has to implement complex support for managing and scheduling the processing. Our study of a large set of scientific applications over the past three years indicates that the processing for such datasets is often highly stylized and shares several important characteristics. Usually, both the input dataset and the result being computed have underlying multi-dimensional grids. The basic processing step usually consists of transforming individual input items, mapping the transformed items to the output grid, and computing output items by aggregating, in some way, all the transformed input items mapped to the corresponding grid point.
In this paper, we present the design of T2, a customizable parallel database that integrates storage, retrieval and processing of multi-dimensional datasets. T2 provides support for common operations including index generation, data retrieval, memory management, scheduling of processing across a parallel machine and user interaction. It achieves its primary advantage from the ability to seamlessly integrate data retrieval and processing for a wide variety of applications and from the ability to maintain and jointly process multiple datasets with different underlying grids. (Also cross-referenced as UMIACS-TR-98-04) [en_US]
dc.format.extent: 219759 bytes
dc.format.mimetype: application/postscript
dc.identifier.uri: http://hdl.handle.net/1903/935
dc.language.iso: en_US
dc.relation.isAvailableAt: Digital Repository at the University of Maryland [en_US]
dc.relation.isAvailableAt: University of Maryland (College Park, Md.) [en_US]
dc.relation.isAvailableAt: Tech Reports in Computer Science and Engineering [en_US]
dc.relation.isAvailableAt: UMIACS Technical Reports [en_US]
dc.relation.ispartofseries: UM Computer Science Department; CS-TR-3867 [en_US]
dc.relation.ispartofseries: UMIACS; UMIACS-TR-98-04 [en_US]
dc.title: T2: A Customizable Parallel Database For Multi-dimensional Data [en_US]
dc.type: Technical Report [en_US]
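
The basic processing step described in the abstract (transform each input item, map it to the output grid, aggregate everything that lands on the same grid point) can be illustrated with a minimal sequential sketch. This is not T2's interface: the function name `transform_map_aggregate`, its parameters, and the sample data are all hypothetical, and the sketch leaves out the indexing, memory management, and parallel scheduling that the report attributes to T2.

```python
from typing import Callable, Dict, Hashable, Iterable, TypeVar

Item = TypeVar("Item")          # raw input item
Transformed = TypeVar("Transformed")  # item after the per-item transform
Acc = TypeVar("Acc")            # aggregated value stored at an output grid point


def transform_map_aggregate(
    items: Iterable[Item],
    transform: Callable[[Item], Transformed],
    map_to_grid: Callable[[Transformed], Hashable],
    aggregate: Callable[[Acc, Transformed], Acc],
    initial: Acc,
) -> Dict[Hashable, Acc]:
    """Hypothetical helper: fold transformed items into an output grid."""
    output: Dict[Hashable, Acc] = {}
    for item in items:
        transformed = transform(item)        # step 1: transform the input item
        point = map_to_grid(transformed)     # step 2: find its output grid point
        current = output.get(point, initial)
        output[point] = aggregate(current, transformed)  # step 3: aggregate
    return output


# Illustrative data (assumed): (x, y, value) samples on a continuous 2-D domain.
samples = [(0.2, 0.7, 3.0), (0.4, 0.9, 5.0), (1.5, 0.1, 2.0)]

result = transform_map_aggregate(
    samples,
    transform=lambda s: (s[0], s[1], s[2] * 2.0),    # rescale the value
    map_to_grid=lambda t: (int(t[0]), int(t[1])),    # bin into unit grid cells
    aggregate=lambda acc, t: max(acc, t[2]),         # keep the maximum per cell
    initial=float("-inf"),
)
print(result)
```

Passing the three steps as separate callables mirrors the abstract's observation that the processing is stylized: only the transform, mapping, and aggregation functions change from one application to the next, while the surrounding loop stays the same.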

Files

Original bundle
Name: CS-TR-3867.ps
Size: 214.61 KB
Format: Postscript Files
Name: CS-TR-3867.pdf
Size: 217.45 KB
Format: Adobe Portable Document Format
Description: Auto-generated copy of CS-TR-3867.ps