
Middleware 2004

Volume 3231 of the series Lecture Notes in Computer Science, pp. 79-98

The Peer Sampling Service: Experimental Evaluation of Unstructured Gossip-Based Implementations

  • Márk Jelasity (University of Bologna; RGAI, MTA SZTE)
  • Rachid Guerraoui (EPFL)
  • Anne-Marie Kermarrec (INRIA)
  • Maarten van Steen (Vrije Universiteit)

Abstract

In recent years, the gossip-based communication model in large-scale distributed systems has become a general paradigm with important applications, including information dissemination, aggregation, overlay topology management, and synchronization. At the heart of all of these protocols lies a fundamental distributed abstraction: the peer sampling service. In short, the aim of this service is to provide every node with peers to exchange information with. Analytical studies reveal the high reliability and efficiency of gossip-based protocols, under the (often implicit) assumption that the peers to which gossip messages are sent are selected uniformly at random from the set of all nodes. In practice, rather than requiring every node to know all other nodes so that a random sample can be drawn, a scalable and efficient way to implement the peer sampling service is to construct and maintain dynamic unstructured overlays by gossiping membership information itself. This paper presents a generic framework for implementing reliable and efficient peer sampling services. The framework generalizes existing approaches and makes it easy to introduce new ones. We use this framework to explore and compare several implementations of our abstraction. Through extensive experimental analysis, we show that all of them lead to different peer sampling services, none of which is uniformly random. This clearly renders traditional theoretical approaches invalid when the underlying peer sampling service is based on a gossip-based scheme. Our observations also help explain important differences between design choices of peer sampling algorithms, and how these impact the reliability of the corresponding service.
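
To make the idea of gossiping membership information concrete, the following is a minimal simulation sketch and not the framework presented in the paper. It assumes that each node keeps a fixed-size partial view of peer identifiers, exchanges views with one sampled peer per round (push-pull), and trims the merged view by random selection. The names Node, get_peer, merge, gossip_round, simulate, and in_degrees, as well as the parameter values, are hypothetical choices made for illustration only.

```python
import random
from collections import Counter


class Node:
    """One participant holding a small partial view of the network."""

    def __init__(self, node_id, view_size):
        self.node_id = node_id
        self.view_size = view_size
        self.view = set()  # ids of the peers this node currently knows

    def get_peer(self, rng):
        """The peer sampling call: return one peer id from the local view."""
        return rng.choice(sorted(self.view))

    def merge(self, received, rng):
        """Merge received ids into the view, then trim at random to view_size.

        Random trimming is an assumption of this sketch; other selection
        policies are possible.
        """
        candidates = (self.view | received) - {self.node_id}
        keep = min(self.view_size, len(candidates))
        self.view = set(rng.sample(sorted(candidates), keep))


def gossip_round(nodes, rng):
    """Each node performs one push-pull membership exchange with a sampled peer."""
    for node in nodes:
        peer = nodes[node.get_peer(rng)]
        sent = node.view | {node.node_id}    # pushed to the peer
        reply = peer.view | {peer.node_id}   # pulled from the peer
        peer.merge(sent, rng)
        node.merge(reply, rng)


def simulate(n=200, view_size=8, rounds=30, seed=1):
    """Start from a ring-like overlay and run the given number of gossip rounds."""
    rng = random.Random(seed)
    nodes = [Node(i, view_size) for i in range(n)]
    for node in nodes:
        node.view = {(node.node_id + k) % n for k in range(1, view_size + 1)}
    for _ in range(rounds):
        gossip_round(nodes, rng)
    return nodes


def in_degrees(nodes):
    """Count, for every node, how many views currently contain it."""
    counts = Counter(pid for node in nodes for pid in node.view)
    return [counts.get(node.node_id, 0) for node in nodes]


nodes = simulate()
degrees = in_degrees(nodes)
print("in-degree min/max:", min(degrees), max(degrees))
```

In a sketch like this, the spread of in-degrees gives a rough indication of how evenly nodes are represented in each other's views. It is one simple way to probe how far the resulting samples are from uniform, which is the kind of question the paper studies experimentally.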