Algorithms for data placement, reconfiguration and monitoring in storage networks

Files

umi-umd-4838.pdf (1.16 MB)
No. of downloads: 2603

Date

2007-08-27

Abstract

In this thesis we address three problems related to the self-management of storage networks: data placement, data reconfiguration and data monitoring. Such storage networks range from centrally managed systems, like Storage Area Networks and Network Attached Storage devices, to highly distributed systems, like a P2P network or a sensor network.

One of the crucial functions of a storage system is deciding the placement of data within the system. This placement depends on the demand pattern for the data and is subject to the constraints of the storage system. For instance, if a particular data item is very popular, the storage system might want to host it on a disk with high bandwidth or make multiple copies of the item. We present new results for some of these data placement problems.
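The popularity example above can be made concrete with a small sketch. It is a plain greedy heuristic, not the thesis's algorithm: the function name `place`, the per-disk `(slots, bandwidth)` model and the tie-breaking rule are all assumptions made for illustration.

```python
def place(demands, disks):
    """Greedy placement sketch (illustrative, not the thesis's algorithm).

    demands: {item: demand}
    disks:   {disk: (storage_slots, bandwidth)}   # assumed model
    Returns {item: {disk: demand_served_from_that_disk}}.
    """
    slots = {d: s for d, (s, _) in disks.items()}
    bandwidth = {d: b for d, (_, b) in disks.items()}
    placement = {}
    # Handle the most popular items first (ties broken by item name).
    for item, demand in sorted(demands.items(), key=lambda kv: (-kv[1], kv[0])):
        copies = {}
        remaining = demand
        while remaining > 0:
            candidates = [
                d for d in disks
                if slots[d] > 0 and bandwidth[d] > 0 and d not in copies
            ]
            if not candidates:
                break  # some demand stays unserved
            # Prefer the disk with the most spare bandwidth.
            best = max(candidates, key=lambda d: (bandwidth[d], d))
            served = min(remaining, bandwidth[best])
            copies[best] = served
            slots[best] -= 1
            bandwidth[best] -= served
            remaining -= served
        placement[item] = copies
    return placement
```

With this heuristic, an item whose demand exceeds the spare bandwidth of any single disk ends up replicated on several disks, while an unpopular item may not be placed at all once slots or bandwidth run out.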

As the demand pattern changes over time, the storage system will have to modify its placement accordingly. Such a modification will typically involve moving data items from one set of disks to another or changing the number of copies of a data item in the system. To be effective, the modification should be computed and applied quickly, because the system runs inefficiently while the reconfiguration is in progress. We propose new schemes to reconfigure the data placement to deal with changing demand.
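To illustrate why the speed of a reconfiguration matters, the sketch below packs a list of data moves into rounds. It assumes a simplified model in which each disk can take part in at most one transfer per round; that model, the function name `schedule_rounds` and the `(item, source, destination)` tuples are assumptions for illustration, and the greedy rule is not claimed to be the scheme proposed in the thesis.

```python
def schedule_rounds(moves):
    """Greedily pack moves into rounds (illustrative sketch).

    moves: list of (item, src_disk, dst_disk) tuples.
    Assumed model: a disk takes part in at most one transfer per round.
    Returns a list of rounds, each a list of moves.
    """
    pending = list(moves)
    rounds = []
    while pending:
        busy = set()          # disks already used in this round
        this_round, later = [], []
        for move in pending:
            _, src, dst = move
            if src in busy or dst in busy:
                later.append(move)      # defer to a later round
            else:
                busy.update((src, dst))
                this_round.append(move)
        rounds.append(this_round)
        pending = later
    return rounds
```

Fewer rounds mean a shorter period during which the system runs with a stale placement, which is the quantity a reconfiguration scheme would try to keep small.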

To re-compute the data placement periodically and to reconfigure it, we need to continuously keep track of the demand distribution in the storage system and be able to answer aggregate queries about that distribution. The data monitoring portion of the thesis deals with such problems, which arise in the context of distributed data management applications. A monitoring system for such a scenario would need to process large amounts of data from a widely distributed set of data sources. The thesis presents new schemes that improve the communication efficiency of existing methods that address these problems.
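One common way to trade accuracy for communication in this setting is to let each source report only when its local count has drifted past a slack threshold. The sketch below shows that generic idea; the class names `Source` and `Coordinator` and the `slack` parameter are hypothetical, and this is not presented as the scheme developed in the thesis.

```python
class Source:
    """A data source that reports its demand count only after enough drift."""

    def __init__(self, slack):
        self.slack = slack      # tolerated gap between true and reported count
        self.count = 0          # true local count
        self.reported = 0       # value last sent to the coordinator

    def observe(self, delta=1):
        """Record local demand; return a new value to send, or None."""
        self.count += delta
        if abs(self.count - self.reported) > self.slack:
            self.reported = self.count
            return self.count
        return None


class Coordinator:
    """Keeps the last reported value per source and answers sum queries."""

    def __init__(self):
        self.last = {}
        self.messages = 0

    def receive(self, source_id, value):
        self.last[source_id] = value
        self.messages += 1

    def estimate(self):
        # Off from the true total by at most (number of sources) * slack.
        return sum(self.last.values())
```

Each source's reported value stays within `slack` of its true count, so the coordinator's sum is off by at most the number of sources times `slack`, while far fewer messages are sent than there are observed requests.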
