On a Graph-theoretic Approach to Scheduling Large-scale Data Transfers
Files
Publication or External Link
Date
Advisor
Citation
DRUM DOI
Abstract
In this paper we consider the problem of moving a large amount of
data from several different hosts to a single destination in a wide-area
network. Often, due to congestion conditions, the choice of paths
by the network may be poor. By choosing an indirect
route at the application level,
we may be able to obtain substantially higher performance
in moving data through the network. We formulate this data transfer
(collection)
problem as a network flow problem. We show that by using a min-cost
flow algorithm on an appropriately defined
time-expanded (network) graph, we can obtain
a data transfer schedule. We show that such schedules can be an order of magnitude better than schedules obtained by transferring data directly
from each host to the destination. In fact, this holds, even though we
make no assumptions about knowledge of the topology of the network or
the capacity available on individual links of the network. We simply
use end-to-end type information and compute a schedule for transferring
the data.
Finally, we also study the shortcomings of this approach in terms
of the gap between the network flow formulation and data transfers in
a wide-area network.
Also UMIACS-TR-2002-04