University of Maryland DRUM  
University of Maryland Digital Repository at the University of Maryland

DRUM >
Theses and Dissertations from UMD >
UMD Theses and Dissertations >

Please use this identifier to cite or link to this item: http://hdl.handle.net/1903/2685

Title: Algorithms for Data Migration
Authors: Kim, Yoo-Ah
Advisors: Khuller, Samir
Department/Program: Computer Science
Type: Dissertation
Sponsors: Digital Repository at the University of Maryland
University of Maryland (College Park, Md.)
Keywords: Computer Science (0984)
Approximation algorithms; Data Migration; Broadcasting
Issue Date: 15-Jun-2005
Abstract: This thesis is concerned with the problem related to data storage and management. A large storage server consists of several hundreds of disks. To balance the load across disks, the system computes data layouts that are typically adjusted according to the workload. As workloads change over time, the system recomputes the data layout, and rearranges the data items according to the new layout. We identify the problem of computing an efficient data migration plan that converts an initial layout to a target layout. We define the data migration problem as follows: for each item, there are a set of disks that have the item (sources) and a set of disks that want to receive the item (destinations). We want to migrate the data items from the sources to destinations. The crucial constraint is that each disk can participate in only one transfer at a time. The most common objective has been to minimize the makespan, which is the time when we finish all the migrations. The problem is NP-hard, and we develop polynomial time algorithms with constant factor approximation guarantees and several other heuristic algorithms. We present the performance evaluation of the different methods through an experimental study. We also consider the data migration problem to minimize the sum of completion times over all migration jobs or storage devices. Minimizing the sum of completion times of jobs is one of the most common objectives in scheduling literature. On the other hand, since a storage device may run inefficiently while the device is involved in migrations, another interesting objective is to minimize the sum of completion times over all storage devices. We present hardness results and constant factor approximation algorithms for these objectives. In addition, we consider the case when we have a heterogeneous collection of machines. We assume that heterogeneity is modeled by a non-uniform speed of the sending machine. For the basic problem of multicasting and broadcasting in the model, we show that Fastest Node First scheme gives a approximation ratio of 1.5 for minimizing the makespan. We also prove that there is a polynomial time approximation scheme.
URI: http://hdl.handle.net/1903/2685
Appears in Collections:UMD Theses and Dissertations
Computer Science Theses and Dissertations

Files in This Item:

File Description SizeFormatNo. of Downloads
umi-umd-2607.pdf729.22 kBAdobe PDF1768View/Open

All items in DRUM are protected by copyright, with all rights reserved.

 

DRUM is brought to you by the University of Maryland Libraries
University of Maryland, College Park, MD 20742-7011 (301)314-1328.
Please send us your comments. -
All Contents