Replication Techniques for Peer-to-Peer Networks
Keleher, Peter J
MetadataShow full item record
A close examination of presently deployed peer-to-peer networks (P2P) and existing proposals reveals several trends regarding data management. First, only a small percentage of the queried data is actually retrieved by users. Second, the vast majority of data exported by participants in P2P networks is unaltered after creation. Third, membership changes in P2P networks (users/sites leaving or joining the network) occur on a scale not seen before in distributed systems. We address the challenges posed by the above observations using two data replication techniques. First, we define a routing protocol and replication model for hierarchical namespaces that adaptively replicates routing information in response to fluctuating load incurred by the lookup and search procedures. Second, we define a replica control protocol (d-spaces) based on quorum consensus in multi-dimensional spaces that is well suited for scenarios where reads are the dominant type of access to data. We extend the basic d-space protocol with a quorum reconfiguration mechanism which uses local connectivity and cached quorum groups instead of global views to achieve seemingly contradicting goals: improved fault tolerance through local reconfiguration policies, and good global latencies. Our main contributions are as follows: We show that hierarchical bottlenecks can be eliminated, with the resulting system behaving like a true symmetrical P2P network. Temporal locality in the stream of requests (hot-spots) is managed by replicating on demand popular portions of the namespace. We argue that non-virtualized object namespaces maintain desirable properties that are lost through virtualization. We present a token-based method of resolving complex search queries through hierarchical query decomposition and result recomposition. The method allows the initiator to ascertain query completion, and offers a broad range of trade-offs between network bandwidth consumption and the number of connections required to complete the search. We show that d-space quorums have optimal communication complexity. We define the Read-Few Write-Many replica control protocol and argue that the quality of trade-off between read efficiency and update availability is not matched by existing replica control protocols. Finally, we present a quorum reconfiguration mechanism that allows membership changes without requiring global reconfiguration phases.