Replication Techniques for Peer-to-Peer Networks
Files
Publication or External Link
Date
Authors
Advisor
Citation
DRUM DOI
Abstract
A close examination of presently deployed peer-to-peer networks
(P2P) and existing proposals reveals several trends regarding data management.
First, only a small percentage of the queried data is actually retrieved by
users. Second, the vast majority of data exported by participants in P2P
networks is unaltered after creation. Third, membership changes in P2P networks
(users/sites leaving or joining the network) occur on a scale not seen before in
distributed systems.
We address the challenges posed by the above observations using two data
replication techniques. First, we define a routing protocol and replication
model for hierarchical namespaces that adaptively replicates routing information
in response to fluctuating load incurred by the lookup and search procedures.
Second, we define a replica control protocol (d-spaces) based on quorum
consensus in multi-dimensional spaces that is well suited for scenarios where
reads are the dominant type of access to data. We extend the basic d-space
protocol with a quorum reconfiguration mechanism which uses local connectivity
and cached quorum groups instead of global views to achieve seemingly
contradicting goals: improved fault tolerance through local reconfiguration
policies, and good global latencies.
Our main contributions are as follows: We show that hierarchical bottlenecks can
be eliminated, with the resulting system behaving like a true symmetrical P2P
network. Temporal locality in the stream of requests (hot-spots) is managed by
replicating on demand popular portions of the namespace. We argue that
non-virtualized object namespaces maintain desirable properties that are lost
through virtualization. We present a token-based method of resolving complex
search queries through hierarchical query decomposition and result
recomposition. The method allows the initiator to ascertain query completion,
and offers a broad range of trade-offs between network bandwidth consumption and
the number of connections required to complete the search. We show that d-space
quorums have optimal communication complexity. We define the Read-Few Write-Many
replica control protocol and argue that the quality of trade-off between read
efficiency and update availability is not matched by existing replica control
protocols. Finally, we present a quorum reconfiguration mechanism that allows
membership changes without requiring global reconfiguration phases.