Distributed Computing, Volume 30, Issue 1, pp 49–73

A coded shared atomic memory algorithm for message passing architectures

  • Viveck R. Cadambe
  • Nancy Lynch
  • Muriel Médard
  • Peter Musial


This paper considers the communication and storage costs of emulating atomic (linearizable) multi-writer multi-reader shared memory in distributed message-passing systems. The paper contains three main contributions.

(1) We present an atomic shared-memory emulation algorithm that we call Coded Atomic Storage (CAS). This algorithm uses erasure coding. In a storage system with N servers that is resilient to f server failures, we show that the communication cost of CAS is \(\frac{N}{N-2f}\). The storage cost of CAS is unbounded.

(2) We present a modification of the CAS algorithm known as CAS with garbage collection (CASGC). The CASGC algorithm is parameterized by an integer \(\delta\) and has a bounded storage cost. We show that the CASGC algorithm satisfies atomicity. In every execution of CASGC where the number of server failures is at most f, we show that every write operation invoked at a non-failing client terminates. We also show that in an execution of CASGC with parameter \(\delta\) where the number of server failures is at most f, a read operation terminates provided that the number of write operations concurrent with the read is at most \(\delta\). We explicitly characterize the storage cost of CASGC, and show that it has the same communication cost as CAS.

(3) We describe an algorithm known as the Communication Cost Optimal Atomic Storage (CCOAS) algorithm that achieves a smaller communication cost than CAS and CASGC. In particular, CCOAS incurs read and write communication costs of \(\frac{N}{N-f}\), measured in terms of the number of object values. We also discuss drawbacks of CCOAS as compared with CAS and CASGC.
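To make the two communication-cost expressions above concrete, the following is a minimal Python sketch that evaluates \(\frac{N}{N-2f}\) and \(\frac{N}{N-f}\) for given N and f. The function names `cas_comm_cost` and `ccoas_comm_cost` are hypothetical, and the validity checks (N > 2f and N > f) are assumptions added only so that the formulas are well defined; the abstract does not state them. The sketch evaluates the formulas and does not implement any of the algorithms.

```python
from fractions import Fraction


def cas_comm_cost(n: int, f: int) -> Fraction:
    """Communication cost N / (N - 2f) quoted for CAS (and CASGC).

    Assumption (not stated in the abstract): N > 2f, so the
    denominator is positive.
    """
    if f < 0 or n <= 2 * f:
        raise ValueError("need 0 <= f and N > 2f")
    return Fraction(n, n - 2 * f)


def ccoas_comm_cost(n: int, f: int) -> Fraction:
    """Communication cost N / (N - f) quoted for CCOAS.

    Assumption (not stated in the abstract): N > f.
    """
    if f < 0 or n <= f:
        raise ValueError("need 0 <= f and N > f")
    return Fraction(n, n - f)


if __name__ == "__main__":
    # Illustrative parameters only: N = 5 servers, f = 1 or 2 failures.
    for n, f in [(5, 1), (5, 2)]:
        print(f"N={n}, f={f}: CAS {cas_comm_cost(n, f)}, "
              f"CCOAS {ccoas_comm_cost(n, f)}")
```

For example, with N = 5 and f = 2 the formulas give 5 for CAS and 5/3 for CCOAS, which illustrates the smaller communication cost claimed for CCOAS.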


Shared memory emulation · Erasure coding · Multi-writer multi-reader atomic register · Concurrent read and write operations · Storage efficiency



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Viveck R. Cadambe (1, corresponding author)
  • Nancy Lynch (2)
  • Muriel Médard (3)
  • Peter Musial (4)

  1. Department of Electrical Engineering, Pennsylvania State University, University Park, USA
  2. Computer Science and Artificial Intelligence Laboratory (CSAIL), Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, USA
  3. Research Laboratory of Electronics (RLE), Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, USA
  4. Advanced Storage Division, EMC Corporation, Cambridge, USA
