# A coded shared atomic memory algorithm for message passing architectures

- 349 Downloads
- 1 Citations

## Abstract

This paper considers the communication and storage costs of emulating atomic (linearizable) multi-writer multi-reader shared memory in distributed message-passing systems. The paper contains three main contributions: (1) we present an atomic shared-memory emulation algorithm that we call *Coded Atomic Storage* (CAS). This algorithm uses *erasure coding* methods. In a storage system with *N* servers that is resilient to *f* server failures, we show that the communication cost of CAS is \(\frac{N}{N-2f}\). The storage cost of CAS is unbounded. (2) We present a modification of the CAS algorithm known as CAS with garbage collection (CASGC). The CASGC algorithm is parameterized by an integer \(\delta \) and has a bounded storage cost. We show that the CASGC algorithm satisfies atomicity. In every execution of CASGC where the number of server failures is no bigger than *f*, we show that every write operation invoked at a non-failing client terminates. We also show that in an execution of CASGC with parameter \(\delta \) where the number of server failures is no bigger than *f*, a read operation terminates provided that the number of write operations that are concurrent with the read is no bigger than \(\delta \). We explicitly characterize the storage cost of CASGC, and show that it has the same communication cost as CAS. (3) We describe an algorithm known as the Communication Cost Optimal Atomic Storage (CCOAS) algorithm that achieves a smaller communication cost than CAS and CASGC. In particular, CCOAS incurs read and write communication costs of \(\frac{N}{N-f}\) measured in terms of number of object values. We also discuss drawbacks of CCOAS as compared with CAS and CASGC.

## Keywords

Shared memory emulation Erasure coding Multi-writer multi-reader atomic register Concurrent read and write operations Storage efficiency## References

- 1.Common RAID disk data format specification. SNIA, Advanced Storage and Information Technology Standard, version 2 (2009)Google Scholar
- 2.Abd-El-Malek, M., Ganger, G.R., Goodson, G.R., Reiter, M.K., Wylie, J.J.: Fault-scalable byzantine fault-tolerant services. ACM SIGOPS Oper. Syst. Rev.
**39**, 59–74 (2005)CrossRefGoogle Scholar - 3.Agrawal, A., Jalote, P.: Coding-based replication schemes for distributed systems. IEEE Trans. Parallel Distrib. Syst.
**6**(3), 240–251 (1995). doi: 10.1109/71.372774 CrossRefGoogle Scholar - 4.Aguilera, M.K., Janakiraman, R., Xu, L.: Using erasure codes efficiently for storage in a distributed system. In: Proceedings of International Conference on Dependable Systems and Networks (DSN), pp. 336–345. IEEE (2005)Google Scholar
- 5.Aguilera, M.K., Keidar, I., Malkhi, D., Shraer, A.: Dynamic atomic storage without consensus. J. ACM
**58**, 7:1–7:32 (2011). doi: 10.1145/1944345.1944348 MathSciNetCrossRefzbMATHGoogle Scholar - 6.Anderson, E., Li, X., Merchant, A., Shah, M.A., Smathers, K., Tucek, J., Uysal, M., Wylie, J.J.: Efficient eventual consistency in pahoehoe, an erasure-coded key-blob archive. In: IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 181–190. IEEE (2010)Google Scholar
- 7.Androulaki, E., Cachin, C., Dobre, D., Vukolić, M.: Erasure-coded byzantine storage with separate metadata. In: Aguilera, M.K., et al. (eds.) Principles of Distributed Systems. 18th International Conference, OPODIS 2014, Cortina d’Ampezzo, Italy, December 16–19, 2014. Proceedings, pp. 76–90. Springer, New York (2014)Google Scholar
- 8.Attiya, H., Bar-Noy, A., Dolev, D.: Sharing memory robustly in message-passing systems. J. ACM (JACM)
**42**(1), 124–142 (1995)CrossRefzbMATHGoogle Scholar - 9.Cachin, C., Tessaro, S.: Asynchronous verifiable information dispersal. In: Fraigniaud, P. (ed.) Distributed Computing. 19th International Conference, DISC 2005, Cracow, Poland, September 26–29, 2005. Proceedings, pp. 503–504. Springer, Berlin, Heidelberg (2005)Google Scholar
- 10.Cachin, C., Tessaro, S.: Optimal resilience for erasure-coded byzantine distributed storage. In: 2006 International Conference on Dependable Systems and Networks (DSN), pp. 115–124. IEEE (2006)Google Scholar
- 11.Cadambe, V.R., Lynch, N., Medard, M., Musial, P.: A coded shared atomic memory algorithm for message passing architectures. In: 13th International Symposium on Network Computing and Applications (NCA), pp. 253–260. IEEE (2014)Google Scholar
- 12.Cassuto, Y.: What can coding theory do for storage systems? ACM SIGACT News
**44**(1), 80–88 (2013)MathSciNetCrossRefGoogle Scholar - 13.Datta, A., Oggier, F.: An overview of codes tailor-made for better repairability in networked distributed storage systems. ACM SIGACT News
**44**(1), 89–105 (2013)MathSciNetCrossRefGoogle Scholar - 14.Dobre, D., Karame, G., Li, W., Majuntke, M., Suri, N., Vukolić, M.: PoWerStore: proofs of writing for efficient and robust storage. In: Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications security, pp. 285–298. ACM (2013)Google Scholar
- 15.Dutta, P., Guerraoui, R., Levy, R.R.: Optimistic erasure-coded distributed storage. In: Taubenfeld, G. (ed.) Distributed Computing. 22nd International Symposium, DISC 2008, Arcachon, France, September 22–24, 2008. Proceedings, pp. 182–196. Springer, New York (2008)Google Scholar
- 16.Fan, R., Lynch, N.: Efficient replication of large data objects. In: Proceedings of the 17th International Symposium on Distributed Computing (DISC), pp. 75–91 (2003)Google Scholar
- 17.Fekete, A., Lynch, N., Shvartsman, A.: Specifying and using a partitionable group communication service. ACM Trans. Comput. Syst.
**19**(2), 171–216 (2001). doi: 10.1145/377769.377776 - 18.Gifford, D.K.: Weighted voting for replicated data. In: Proceedings of the Seventh ACM Symposium on Operating Systems Principles, SOSP ’79, pp. 150–162. ACM, New York (1979). doi: 10.1145/800215.806583
- 19.Gilbert, S., Lynch, N., Shvartsman, A.: RAMBO: a robust, reconfigurable atomic memory service for dynamic networks. Distrib. Comput.
**23**(4), 225–272 (2010)CrossRefzbMATHGoogle Scholar - 20.Goodson, G.R., Wylie, J.J., Ganger, G.R., Reiter, M.K.: Efficient byzantine-tolerant erasure-coded storage. In: 2004 International Conference on Dependable Systems and Networks, pp. 135–144. IEEE (2004)Google Scholar
- 21.Hendricks, J., Ganger, G.R., Reiter, M.K.: Low-overhead Byzantine fault-tolerant storage. In: Proceedings of the Seventh ACM Symposium on Operating Systems Principles (SOSP) vol. 41, no. 6, pp. 73–86 (2007)Google Scholar
- 22.Herlihy, M.P., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst.
**12**, 463–492 (1990). doi: 10.1145/78969.78972 CrossRefGoogle Scholar - 23.Lamport, L.: On interprocess communication. Part I: basic formalism. Distrib. Comput.
**2**(1), 77–85 (1986)CrossRefzbMATHGoogle Scholar - 24.Lin, S., Costello, D.J.: Error Control Coding, 2nd edn. Prentice-Hall, Upper Saddle River (2004)zbMATHGoogle Scholar
- 25.Lynch, N., Shvartsman, A.: Robust emulation of shared memory using dynamic quorum-acknowledged broadcasts. In: Twenty-Seventh Annual International Symposium on Fault-Tolerant Computing, FTCS-27. Digest of Papers, pp. 272–281. IEEE (1997)Google Scholar
- 26.Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann Publishers, San Francisco (1996)zbMATHGoogle Scholar
- 27.Lynch, N.A., Tuttle, M.R.: An introduction to input/output automata. CWI Q.
**2**, 219–246 (1989)MathSciNetzbMATHGoogle Scholar - 28.Malkhi, D., Reiter, M.: Byzantine quorum systems. Distrib. Comput.
**11**(4), 203–213 (1998). doi: 10.1007/s004460050050 CrossRefzbMATHGoogle Scholar - 29.Martin, J.P., Alvisi, L., Dahlin, M.: Minimal byzantine storage. In: Malkhi, D. (ed.) Distributed Computing. 16th International Conference, DISC 2002, Toulouse, France, October 28–30, 2002. Proceedings, pp. 311–325. Springer, New York (2002)Google Scholar
- 30.Plank, J.S.: T1: erasure codes for storage applications. In: Proceedings of the 4th USENIX Conference on File and Storage Technologies, pp. 1–74 (2005)Google Scholar
- 31.Reed, I.S., Solomon, G.: Polynomial codes over certain finite fields. J. Soc. Ind. Appl. Math.
**8**(2), 300–304 (1960)MathSciNetCrossRefzbMATHGoogle Scholar - 32.Roth, R.: Introduction to Coding Theory. Cambridge University Press, Cambridge (2006)CrossRefzbMATHGoogle Scholar
- 33.Saito, Y., Frølund, S., Veitch, A., Merchant, A., Spence, S.: Fab: building distributed enterprise disk arrays from commodity components. In: ACM SIGARCH Computer Architecture News, vol. 32, pp. 48–58. ACM (2004)Google Scholar
- 34.Thomas, R.: A majority consensus approach to concurrency control for multiple copy databases. ACM Trans. Database Syst.
**4**(2), 180–209 (1979)CrossRefGoogle Scholar - 35.Vukolić, M.: Quorum systems: with applications to storage and consensus. Synth. Lect. Distrib. Comput. Theory
**3**(1), 1–146 (2012). doi: 10.2200/S00402ED1V01Y201202DCT009 CrossRefGoogle Scholar - 36.Wang, Z., Cadambe, V.R.: Multi-version coding in distributed storage. In: 2014 IEEE International Symposium on Information Theory (ISIT) (2014)Google Scholar