Mobile Networks and Applications

, Volume 17, Issue 3, pp 378–394 | Cite as

Still Alive: Extending Keep-Alive Intervals in P2P Overlay Networks

  • Richard PriceEmail author
  • Peter Tiňo
  • Georgios Theodoropoulos


Detecting node failures within Peer-to-Peer networks is an inherent trade-off between timely detection and consuming bandwidth on network maintenance. In the absence of user-driven messages, the majority of P2P networks rely upon the exchange of periodic keep-alive messages to maintain connections and network topology. We investigate three novel algorithms which prioritise keep-alive messages to nodes that are more likely to have failed. In doing so, these algorithms significantly reduce the expected delay between failures occurring and their subsequent detection in comparison to the standard approach, whilst consuming similar levels of bandwidth. Our algorithms build upon several studies that have shown that older peers are more likely to remain in the network than their short-lived counterparts. Each of our algorithms increase the interval between successive keep-alive messages as peers age in the system, based upon the distribution of peer session times and the current age of peers. We extensively describe the details of each algorithm, before comparing them to the standard periodic approach using simulations based upon measured network data. Furthermore, we show that these algorithms are complimentary to existing gossip-based mechanisms and investigate alternate methods of ascertaining a node’s age so that our algorithms can be robustly deployed in untrustworthy environments.


Peer-to-Peer networks P2P distributed systems failure detection computer networks 


  1. 1.
    Gnutella (2010) Gnutella protocol development wiki [Online].
  2. 2.
    Cohen B (2003) Incentives build robustness in BitTorrent. In: Workshop on economics of Peer-to-Peer systems, vol 6Google Scholar
  3. 3.
    Stoica I, Morris R, Karger D, Kaashoek F, Balakrishnan H (2001) Chord: a scalable Peer-To-Peer lookup service for internet applications. In: Proceedings of the 2001 ACM SIGCOMM conference, pp 149–160. [Online]. Available:
  4. 4.
    Rowstron AIT, Druschel P (2001) Pastry: scalable, decentralized object location, and routing for large-scale Peer-to-Peer systems. In: Proc. of the IFIP/ACM int. conf. on distributed systems platforms. Springer, Heidelberg, pp 329–350Google Scholar
  5. 5.
    Rhea S, Geels D, Roscoe T, Kubiatowicz J (2004) Handling churn in a dht. In: ATEC ’04: Proceedings of the annual conference on USENIX annual technical conference. USENIX Association, Berkeley, p 10Google Scholar
  6. 6.
    Alima L, El-Ansary S, Brand P, Haridi S (2003) DKS (N, k, f): a family of low communication, scalable and fault-tolerant infrastructures for P2P applications. In: Proceedings of the 3st international symposium on cluster computing and the grid. IEEE Computer Society, Washington, DC, p 344Google Scholar
  7. 7.
    Krishnamurthy S, El-Ansary S, Aurell E, Haridi S (2008) Comparing maintenance strategies for overlays. In: Proceedings of the 16th Euromicro conference on parallel, distributed and network-based processing (PDP 2008). IEEE Computer Society, Washington, DC, pp 473–482CrossRefGoogle Scholar
  8. 8.
    Ghinita G, Teo YM (2006) An adaptive stabilization framework for distributed hash tables, p 10Google Scholar
  9. 9.
    Castro M, Costa M, Rowstron A (2004) Performance and dependability of structured Peer-to-Peer overlays. In: DSN ’04: Proceedings of the 2004 international conference on dependable systems and networks. IEEE Computer Society, Washington, DC, p 9Google Scholar
  10. 10.
    Zhuang S, Geels D, Stoica I, Katz R (2005) On failure detection algorithms in overlay networks. In: Proceedings IEEE INFOCOM 2005. 24th annual joint conference of the IEEE Computer and Communications Societies, vol 3Google Scholar
  11. 11.
    Dedinski I, Hofmann A, Sick B (2007) Cooperative keep-alives: an efficient outage detection algorithm for P2P overlay networks. In: P2P ’07: Proceedings of the seventh IEEE international conference on Peer-to-Peer computing. IEEE Computer Society, Washington, DC, pp 140–150CrossRefGoogle Scholar
  12. 12.
    Li J, Stribling J, Morris R, Kaashoek MF (2005) Bandwidth-efficient management of dht routing tables. In: NSDI’05: Proceedings of the 2nd conference on symposium on networked systems design & implementation. USENIX Association, Berkeley, pp 99–114Google Scholar
  13. 13.
    So KCW, Sirer EG (2007) Latency and bandwidth-minimizing failure detectors. SIGOPS Oper Syst Rev 41(3):89–99CrossRefGoogle Scholar
  14. 14.
    Stutzbach D, Rejaie R (2006) Understanding churn in Peer-to-Peer networks. In: IMC ’06: Proceedings of the 6th ACM SIGCOMM on Internet measurement. ACM, New York, pp 189–202. [Online]. Available:
  15. 15.
    Steiner M, En-Najjary T, Biersack EW (2009) Long term study of peer behavior in the kad dht. IEEE/ACM Trans Netw 17(5):1371–1384CrossRefGoogle Scholar
  16. 16.
    Bustamante FE, Qiao Y (2004) Friendships that last: peer lifespan and its role in P2P protocols. In: Web content caching and distribution: proceedings of the 8th international workshop. Kluwer, Norwell, pp 233–246CrossRefGoogle Scholar
  17. 17.
    Stutzbach D, Rejaie R, Sen S (2005) Characterizing unstructured overlay topologies in modern P2P file-sharing systems. In: Proc. of Internet measurement conference (IMC). [Online]. Available:
  18. 18.
    Shimizu R (1979) On a lack of memory property of the exponential distribution. Ann Inst Stat Math 31(1):309–313MathSciNetzbMATHCrossRefGoogle Scholar
  19. 19.
    Saroiu S, Gummadi KP, Gribble SD (2003) Measuring and analyzing the characteristics of napster and gnutella hosts. Multimedia Syst 9(2):170–184CrossRefGoogle Scholar
  20. 20.
    Leonard D, Rai V, Loguinov D (2005) On lifetime-based node failure and stochastic resilience of decentralized Peer-to-Peer networks. In: SIGMETRICS ’05: Proceedings of the 2005 ACM SIGMETRICS international conference on measurement and modeling of computer systems. ACM, New York, pp 26–37CrossRefGoogle Scholar
  21. 21.
    Izal M (2010) Bittorrent traces and tools. World Wide Web,
  22. 22.
    Izal M, Urvoy-Keller G, Biersack EW, Felber P, Al Hamra A, Garcés-Erice L (2004) Dissecting bittorrent: five months in a torrent’s lifetime. In: Barakat C, Pratt I (eds) Passive and active network measurement, ser. Lecture notes in computer science, vol 3015. Springer, Berlin, pp 1–11. [Online]. Available:
  23. 23.
    Saroiu S, Gummadi P, Gribble S (2002) A measurement study of Peer-to-Peer file sharing systems. [Online]. Available:
  24. 24.
    Gummadi KP, Dunn RJ, Saroiu S, Gribble SD, Levy HM, Zahorjan J (2003) Measurement, modeling, and analysis of a Peer-to-Peer file-sharing workload. In: SOSP ’03: Proceedings of the nineteenth ACM symposium on operating systems principles. ACM, New York, pp 314–329CrossRefGoogle Scholar
  25. 25.
    Mahajan R, Castro M, Rowstron A (2003) Controlling the cost of reliability in Peer-to-Peer overlays. In: IPTPSGoogle Scholar
  26. 26.
    Chen W, Toueg S, Aguilera MK (2000) On the quality of service of failure detectors. IEEE Trans Comput 51:561–580MathSciNetCrossRefGoogle Scholar
  27. 27.
    Li J, Stribling J, Morris R, Kaashoek MF, Gil TM (2005) A performance vs. cost framework for evaluating DHT design tradeoffs under churn. In: Proc. of the 24th InfocomGoogle Scholar
  28. 28.
    Bustamante FE, QiaoY (2008) Designing less-structured P2P systems for the expected high churn. IEEE/ACM Trans Netw 16(3):617–627CrossRefGoogle Scholar

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Richard Price
    • 1
    Email author
  • Peter Tiňo
    • 1
  • Georgios Theodoropoulos
    • 1
  1. 1.School of Computer ScienceUniversity of BirminghamBirminghamUK

Personalised recommendations