DataCutter and A Client Interface for the Storage Resource Broker with
DataCutter Services
Date
2000-05-13
Authors
Kurc, Tahsin
Beynon, Michael
Sussman, Alan
Saltz, Joel
Abstract
The continuing increase in the capabilities of high performance
computers and continued decreases in the cost of secondary and
tertiary storage systems are making it increasingly feasible to
generate and archive very large (e.g., petabyte and larger)
datasets. Applications are also increasingly likely to make use of
archived data obtained by different types of sensors. Such sensors
include imaging devices deployed on satellites and aircraft, as well
as microscopy and radiology imaging instruments.
Simulation or sensor datasets generated or acquired by one group may
need to be accessed over a wide-area network by other groups.
Datasets frequently describe data associated with collections of very
large structured or unstructured grids where each grid point is
associated with several variables. Applications frequently need only
to obtain portions of a dataset. Required data may correspond to a
particular region in a multidimensional space. The application may
need to access all data associated with a multidimensional region or it
may need only certain variable values at a subsampled set of spatial
locations. In addition, in some cases, applications may require data
products obtained by aggregating data in one way or another. For
instance, a user might require time or space averaged data.
This document describes the design of a middleware infrastructure,
called DataCutter, that enables subsetting and user-defined
filtering of multi-dimensional datasets stored in archival storage
systems across a wide-area network. We also describe a client API for
Storage Resource Broker (SRB) clients, which allows SRB clients to
carry out subsetting and filtering of datasets stored through the
SRB. This API uses a prototype implementation of the DataCutter
indexing and filtering services.
(Also cross-referenced as UMIACS-TR-2000-26)
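
The kinds of requests the abstract mentions (selecting a multidimensional region, subsampling it, and aggregating the result) can be illustrated with a small sketch. This is not the DataCutter or SRB API: the names `Chunk`, `range_query`, `subset`, `average`, and `make_grid` are hypothetical, the grid is two-dimensional with a single variable, and the "index" is a plain linear scan over chunk bounding boxes, chosen only to keep the example short.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

Point = Tuple[int, int]
Box = Tuple[Point, Point]  # inclusive (low corner, high corner)


@dataclass
class Chunk:
    """A rectangular piece of a 2-D grid (hypothetical, not a DataCutter type)."""
    box: Box
    values: Dict[Point, float]  # grid point -> value of one variable


def intersects(a: Box, b: Box) -> bool:
    """True if two inclusive boxes overlap in every dimension."""
    (alo, ahi), (blo, bhi) = a, b
    return all(alo[d] <= bhi[d] and blo[d] <= ahi[d] for d in range(2))


def range_query(chunks: Iterable[Chunk], query: Box) -> List[Chunk]:
    """Index step (assumed): keep only chunks whose box overlaps the query."""
    return [c for c in chunks if intersects(c.box, query)]


def subset(chunks: Iterable[Chunk], query: Box,
           stride: int = 1) -> Iterator[Tuple[Point, float]]:
    """Filter step (assumed): yield points inside the query, every stride-th one."""
    qlo, qhi = query
    for c in range_query(chunks, query):
        for (x, y), v in sorted(c.values.items()):
            inside = qlo[0] <= x <= qhi[0] and qlo[1] <= y <= qhi[1]
            sampled = (x - qlo[0]) % stride == 0 and (y - qlo[1]) % stride == 0
            if inside and sampled:
                yield (x, y), v


def average(pairs: Iterable[Tuple[Point, float]]) -> Optional[float]:
    """Aggregation step: mean of the selected values, or None if empty."""
    vals = [v for _, v in pairs]
    return sum(vals) / len(vals) if vals else None


def make_grid(size: int = 4, chunk: int = 2) -> List[Chunk]:
    """Build a size x size grid split into chunk x chunk pieces; value = x + 10*y."""
    chunks = []
    for cy in range(0, size, chunk):
        for cx in range(0, size, chunk):
            box = ((cx, cy), (cx + chunk - 1, cy + chunk - 1))
            values = {(x, y): float(x + 10 * y)
                      for x in range(cx, cx + chunk)
                      for y in range(cy, cy + chunk)}
            chunks.append(Chunk(box, values))
    return chunks


if __name__ == "__main__":
    grid = make_grid()
    region = ((1, 1), (2, 2))
    print(len(range_query(grid, region)), "chunks overlap the region")
    print("mean over region:", average(subset(grid, region)))
```

The point of the sketch is the ordering: the bounding-box test discards chunks before any point-level filtering runs, so data outside the requested region need not be read or transferred. A production system would presumably replace the linear scan with a spatial index.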