Efficient Synchronization of Replicated Data in Distributed Systems

  • Thorsten Schütt
  • Florian Schintke
  • Alexander Reinefeld
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2657)


We present nsync, a tool for synchronizing large replicated data sets in distributed systems. nsync computes nearly optimal synchronization plans based on a hierarchy of gossip algorithms that take the network topology into account. Our primary design goals were maximum performance and maximum scalability. We achieved these goals by exploiting parallelism in both the planning and the synchronization phases, by omitting the transfer of unnecessary metadata, by synchronizing at the block level rather than the file level, and by using sophisticated compression methods. With its relaxed consistency semantics, nsync needs neither a master copy nor a quorum for updating distributed replicas. Each replica is kept as an autonomous entity and can be modified with the usual tools.
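To make the idea of synchronizing at the block level rather than the file level concrete, here is a minimal sketch. It is not nsync's actual algorithm, which the abstract does not describe. It assumes fixed-size blocks compared by SHA-256 digest, and the names `BLOCK_SIZE`, `block_digests`, `blocks_to_send`, and `apply_blocks` are hypothetical. The point is only that the target sends compact digests, and the source replies with just the blocks that differ.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical block size, kept tiny for illustration


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def blocks_to_send(source: bytes, target_digests: list[bytes],
                   block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Map block index to block bytes for each source block the target lacks or holds differently."""
    changed = {}
    for index, digest in enumerate(block_digests(source, block_size)):
        if index >= len(target_digests) or target_digests[index] != digest:
            start = index * block_size
            changed[index] = source[start:start + block_size]
    return changed


def apply_blocks(target: bytes, changed: dict[int, bytes], source_len: int,
                 block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the source content from the target's old data plus the received blocks."""
    n_blocks = (source_len + block_size - 1) // block_size
    parts = []
    for index in range(n_blocks):
        if index in changed:
            parts.append(changed[index])  # block received from the source
        else:
            start = index * block_size
            parts.append(target[start:start + block_size])  # block reused locally
    return b"".join(parts)[:source_len]


# Usage: only the differing blocks cross the network.
source = b"aaaabbbbccccdd"
target = b"aaaaXXXXcccc"
delta = blocks_to_send(source, block_digests(target))
synced = apply_blocks(target, delta, len(source))
```

In this example only two of the four blocks are transferred. Fixed block boundaries are a simplification: an insertion near the start of a file shifts every later block, which is why practical tools often use rolling checksums or content-defined boundaries instead.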


Keywords (added by machine, not by the authors): Replicate Data; Synchronization Process; Broadcast Tree; Proxy Node; Storage Resource Broker



Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Thorsten Schütt (1)
  • Florian Schintke (1)
  • Alexander Reinefeld (1)

  1. Zuse Institute Berlin (ZIB), Germany
