Replication Techniques for Peer-to-Peer Networks

Thumbnail Image


dissertation.pdf (791.46 KB)
No. of downloads: 1150

Publication or External Link






A close examination of presently deployed peer-to-peer networks

(P2P) and existing proposals reveals several trends regarding data management.

First, only a small percentage of the queried data is actually retrieved by

users. Second, the vast majority of data exported by participants in P2P

networks is unaltered after creation. Third, membership changes in P2P networks

(users/sites leaving or joining the network) occur on a scale not seen before in

distributed systems.

We address the challenges posed by the above observations using two data

replication techniques. First, we define a routing protocol and replication

model for hierarchical namespaces that adaptively replicates routing information

in response to fluctuating load incurred by the lookup and search procedures.

Second, we define a replica control protocol (d-spaces) based on quorum

consensus in multi-dimensional spaces that is well suited for scenarios where

reads are the dominant type of access to data. We extend the basic d-space

protocol with a quorum reconfiguration mechanism which uses local connectivity

and cached quorum groups instead of global views to achieve seemingly

contradicting goals: improved fault tolerance through local reconfiguration

policies, and good global latencies.

Our main contributions are as follows: We show that hierarchical bottlenecks can

be eliminated, with the resulting system behaving like a true symmetrical P2P

network. Temporal locality in the stream of requests (hot-spots) is managed by

replicating on demand popular portions of the namespace. We argue that

non-virtualized object namespaces maintain desirable properties that are lost

through virtualization. We present a token-based method of resolving complex

search queries through hierarchical query decomposition and result

recomposition. The method allows the initiator to ascertain query completion,

and offers a broad range of trade-offs between network bandwidth consumption and

the number of connections required to complete the search. We show that d-space

quorums have optimal communication complexity. We define the Read-Few Write-Many

replica control protocol and argue that the quality of trade-off between read

efficiency and update availability is not matched by existing replica control

protocols. Finally, we present a quorum reconfiguration mechanism that allows

membership changes without requiring global reconfiguration phases.