Interest-Based Self-Organizing Peer-to-Peer Networks
(Research Seminar, September 9th, 2004)

Michael Smith
Carnegie Mellon University

Abstract
Improving the information retrieval (IR) performance of P2P networks is an important and challenging problem. Recently, the computer science literature has tried to address this problem by improving the efficiency of search algorithms. However, little attention has been paid to improving performance through the design of incentives that encourage users to “share” content and mechanisms that enable peers to form “communities” based on shared interests.

Our work draws on the club goods economics literature and the computer science IR literature to propose a next-generation file-sharing architecture that addresses these issues. Using the popular Gnutella 0.6 architecture as context, we conceptualize a Gnutella ultrapeer and its local network of leaf nodes as a “club” (in economic terms). We specify an IR-based utility model for a peer to determine which clubs to join, for a club to manage its membership, and for a club to determine which other clubs it should connect to.
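
As a rough illustration of the club-selection idea only: the abstract does not give the utility function, so the sketch below assumes a simple recall-style proxy, namely the fraction of a peer's queries that a club's shared content can satisfy. The names club_utility and best_club, and the toy data, are hypothetical and are not taken from the work itself.

```python
from typing import Dict, Set


def club_utility(peer_queries: Set[str], club_content: Set[str]) -> float:
    """Assumed proxy for utility: share of the peer's queries the club can answer."""
    if not peer_queries:
        return 0.0
    return len(peer_queries & club_content) / len(peer_queries)


def best_club(peer_queries: Set[str], clubs: Dict[str, Set[str]]) -> str:
    """Pick the club with the highest assumed utility (ties broken by name)."""
    return max(sorted(clubs), key=lambda name: club_utility(peer_queries, clubs[name]))


# Toy data (hypothetical): each club is an ultrapeer plus the content of its leaves.
clubs = {
    "jazz": {"coltrane", "davis", "monk"},
    "rock": {"zeppelin", "hendrix", "davis"},
}
peer = {"coltrane", "monk", "hendrix"}

for name in sorted(clubs):
    print(name, round(club_utility(peer, clubs[name]), 2))
print("join:", best_club(peer, clubs))
```

Under this assumption, a peer prefers the club whose members already hold most of what it searches for. A club could apply a similar score in the other direction when deciding which peers to admit or which other clubs to connect to.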

We simulate the performance of our model using a unique real-world dataset collected from the Gnutella 0.6 network. These simulations show that our club model accomplishes both performance goals. First, peers are self-organized into communities of interest: in our club model, peers are 85% more likely to be able to obtain content from their local club than they are in the current Gnutella 0.6 architecture. Second, peers have increased incentives to share content: our model shows that peers who share can increase their recall performance by nearly five times over the performance offered to free-riders. We also show that the benefits provided by our club model outweigh the added protocol overhead imposed on the network, that our results are stronger in larger simulated networks, and that our results are robust to dynamic networks with typical levels of user entry and exit.