Volume 3, Issue 1–4, pp 393–420

A technique for constructing highly available services

  • Rivka Ladin
  • Barbara Liskov
  • Liuba Shrira


This paper describes a general method for constructing a highly available service for use in a distributed system. It gives a specific implementation of the method and proves the implementation correct. The service consists of replicas that reside at several different locations in a network. It presents its clients with a consistent view of its state, but the view may contain old information. Clients can indicate how recent the information must be. The method can be used in applications satisfying certain semantic constraints. For applications that can use it, the method performs better than other replication techniques.
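The abstract does not spell out the mechanism, so the following is only a minimal sketch of the behaviour it describes: replicas that may lag behind one another, and a client that states how recent the information must be. It assumes vector timestamps, a gossip exchange between replicas, and updates that commute (set insertion here, standing in for the "semantic constraints"). The names `Replica`, `update`, `query`, `gossip_from`, and `dominates` are hypothetical and are not taken from the paper.

```python
def dominates(a, b):
    """True if timestamp a is at least as recent as b in every component."""
    return all(x >= y for x, y in zip(a, b))


class Replica:
    """One copy of the service state (hypothetical illustration)."""

    def __init__(self, index, n):
        self.index = index      # this replica's slot in the timestamp vector
        self.ts = [0] * n       # how many updates from each replica were applied
        self.items = set()      # service state: a grow-only set
        self.log = []           # records of the form (origin, seq, item)

    def update(self, item):
        """Apply a client update locally; return a timestamp the client keeps."""
        self.ts[self.index] += 1
        self.items.add(item)
        self.log.append((self.index, self.ts[self.index], item))
        return tuple(self.ts)

    def query(self, item, at_least):
        """Answer only if this replica is at least as recent as `at_least`."""
        if not dominates(self.ts, at_least):
            return None         # too stale: the client may wait or ask elsewhere
        return item in self.items, tuple(self.ts)

    def gossip_from(self, other):
        """Pull updates from another replica, in per-origin sequence order."""
        for origin, seq, item in sorted(other.log, key=lambda r: (r[0], r[1])):
            if seq == self.ts[origin] + 1:   # next unseen update from origin
                self.ts[origin] = seq
                self.items.add(item)
                self.log.append((origin, seq, item))


r0, r1 = Replica(0, 2), Replica(1, 2)
t = r0.update("x")                 # t == (1, 0)
print(r1.query("x", (0, 0)))       # (False, (0, 0)): old but consistent view
print(r1.query("x", t))            # None: r1 is not yet recent enough
r1.gossip_from(r0)
print(r1.query("x", t))            # (True, (1, 0))
```

In this sketch a client that passes the all-zero timestamp accepts any replica's view, however old, while a client that passes the timestamp returned by its own update is guaranteed to see that update. The log is never truncated here, which keeps the example short but would not be practical in a real service.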

Key words

Distributed systems, Algorithms, Reliability, Availability, Data replication





Copyright information

© Springer-Verlag New York Inc. 1988

Authors and Affiliations

  • Rivka Ladin
    • 1
  • Barbara Liskov
    • 1
  • Liuba Shrira
    • 1
  1. Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, USA
