Skip to main content

Separating Data and Control: Asynchronous BFT Storage with 2t + 1 Data Replicas

  • Conference paper
Stabilization, Safety, and Security of Distributed Systems (SSS 2014)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 8756))

Included in the following conference series:

Abstract

The overhead of Byzantine fault tolerant (BFT) storage is a primary concern that prevents its adoption in practice. The cost stems from the need to maintain at least 3t + 1 copies of the data at different storage replicas in the asynchronous model, so that t Byzantine replica faults can be tolerated. This paper presents MDStore, the first fully asynchronous BFT storage protocol that reduces the number of replicas that store the payload data to as few as 2t + 1 and maintains metadata at 3t + 1 replicas on (possibly) different servers. At the heart of MDStore lies a metadata service built upon a new abstraction called “timestamped storage.” Timestamped storage allows for conditional writes (facilitating the implementation of the metadata service) and has consensus number one (making it implementable with wait-free semantics in an asynchronous system despite faults). In addition to its low replication overhead, MDStore offers strong guarantees by emulating a multi-writer multi-reader atomic register, providing wait-free termination, and tolerating any number of Byzantine readers and crash-faulty writers.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Abraham, I., Chockler, G., Keidar, I., Malkhi, D.: Byzantine disk Paxos: Optimal resilience with Byzantine shared memory. Distributed Computing 18(5), 387–408 (2006)

    Article  MATH  Google Scholar 

  2. Adya, A., Bolosky, W.J., Castro, M., et al.: FARSITE: Federated, available, and reliable storage for an incompletely trusted environment. In: Proc. 5th Symp. Operating Systems Design and Implementation, OSDI (2002)

    Google Scholar 

  3. Aiyer, A.S., Alvisi, L., Bazzi, R.A.: Bounded wait-free implementation of optimally resilient Byzantine storage without (unproven) cryptographic assumptions. In: Pelc, A. (ed.) DISC 2007. LNCS, vol. 4731, pp. 7–19. Springer, Heidelberg (2007)

    Chapter  Google Scholar 

  4. Androulaki, E., Cachin, C., Dobre, D., Vukolić, M.: Erasure-coded Byzantine storage with separate metadata. Report ArXiv:1402.4958, CoRR (2014)

    Google Scholar 

  5. Attiya, H., Welch, J.: Distributed Computing: Fundamentals, Simulations and Advanced Topics. McGraw-Hill, London (1998)

    Google Scholar 

  6. Cachin, C., Dobre, D., Vukolić, M.: BFT storage with 2t + 1 data replicas. Report ArXiv:1305.4868, CoRR (2013)

    Google Scholar 

  7. Cachin, C., Guerraoui, R., Rodrigues, L.: Introduction to Reliable and Secure Distributed Programming, 2nd edn. Springer (2011)

    Google Scholar 

  8. Cachin, C., Junker, B., Sorniotti, A.: On limitations of using cloud storage for data replication. In: Proc. 6th Workshop on Recent Advances in Intrusion Tolerance and reSilience, WRAITS 2012 (2012)

    Google Scholar 

  9. Cachin, C., Tessaro, S.: Optimal resilience for erasure-coded Byzantine distributed storage. In: Proc. International Conference on Dependable Systems and Networks (DSN-DCCS), pp. 115–124 (2006)

    Google Scholar 

  10. Cho, B., Aguilera, M.K.: Surviving congestion in geo-distributed storage systems. In: Proc. USENIX Annual Technical Conference, pp. 439–451 (2012)

    Google Scholar 

  11. Chun, B.-G., Maniatis, P., Shenker, S., Kubiatowicz, J.: Attested append-only memory: Making adversaries stick to their word. In: Proc. 21st ACM Symposium on Operating Systems Principles (SOSP), pp. 189–204 (2007)

    Google Scholar 

  12. Correia, M., Neves, N.F., Veríssimo, P.: How to tolerate half less one Byzantine nodes in practical distributed systems. In: Proc. 23rd Symposium on Reliable Distributed Systems (SRDS), pp. 174–183 (2004)

    Google Scholar 

  13. Dobre, D., Karame, G., Li, W., Majuntke, M., Suri, N., Vukolić, M.: PoWerStore: Proofs of writing for efficient and robust storage. In: Proc. ACM Conference on Computer and Communications Security, CCS (2013)

    Google Scholar 

  14. Dobre, D., Viotti, P., Vukolić, M.: Hybris: Consistency hardening in robust hybrid cloud storage. Research Report RR-13-291, Eurécom (2013)

    Google Scholar 

  15. Dwork, C., Lynch, N., Stockmeyer, L.: Consensus in the presence of partial synchrony. Journal of the ACM 35(2), 288–323 (1988)

    Article  MathSciNet  Google Scholar 

  16. Fan, R., Lynch, N.A.: Efficient replication of large data objects. In: Fich, F.E. (ed.) DISC 2003. LNCS, vol. 2848, pp. 75–91. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  17. Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. Journal of the ACM 32(2), 374–382 (1985)

    Article  MathSciNet  MATH  Google Scholar 

  18. Guerraoui, R., Vukolić, M.: How fast can a very robust read be? In: Proc. 25th ACM Symposium on Principles of Distributed Computing (PODC), pp. 248–257 (2006)

    Google Scholar 

  19. Guerraoui, R., Vukolić, M.: Refined quorum systems. Distributed Computing 23(1), 1–42 (2010)

    Article  MATH  Google Scholar 

  20. Herlihy, M.: Wait-free synchronization. ACM Transactions on Programming Languages and Systems 11(1), 124–149 (1991)

    Article  Google Scholar 

  21. Herlihy, M.P., Wing, J.M.: Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems 12(3), 463–492 (1990)

    Article  Google Scholar 

  22. Kapitza, R., Behl, J., Cachin, C., Distler, T., Kuhnle, S., Mohammadi, S.V., Schröder-Preikschat, W., Stengel, K.: CheapBFT: Resource-efficient Byzantine fault tolerance. In: Proc. 7th European Conference on Computer Systems (EuroSys), pp. 295–308 (April 2012)

    Google Scholar 

  23. Katz, J., Lindell, Y.: Introduction to Modern Cryptography: Principles and Protocols. Chapman & Hall/CRC (2007)

    Google Scholar 

  24. Lamport, L.: On interprocess communication. Distributed Computing 1(2), 77–85, 86–101 (1986)

    Google Scholar 

  25. Lamport, L.: The part-time parliament. ACM Transactions on Computer Systems 16(2), 133–169 (1998)

    Article  Google Scholar 

  26. Lamport, L.: Paxos made simple. SIGACT News 32(4), 51–58 (2001)

    Google Scholar 

  27. Lamport, L.: Lower bounds for asynchronous consensus. In: Schiper, A., Shvartsman, M.M.A.A., Weatherspoon, H., Zhao, B.Y. (eds.) Future Directions in Distributed Computing. LNCS, vol. 2584, pp. 22–23. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  28. Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann, San Francisco (1996)

    Google Scholar 

  29. Lynch, N.A., Tuttle, M.R.: An introduction to input/output automata. CWI Quaterly 2(3), 219–246 (1989)

    MathSciNet  MATH  Google Scholar 

  30. Malkhi, D., Reiter, M.: Secure and scalable replication in Phalanx. In: Proc. 17th Symposium on Reliable Distributed Systems, SRDS (1998)

    Google Scholar 

  31. Malkhi, D., Reiter, M.K.: Byzantine quorum systems. Distributed Computing 11(4), 203–213 (1998)

    Article  Google Scholar 

  32. Martin, J.-P., Alvisi, L., Dahlin, M.: Minimal Byzantine storage. In: Malkhi, D. (ed.) DISC 2002. LNCS, vol. 2508, pp. 311–325. Springer, Heidelberg (2002)

    Chapter  Google Scholar 

  33. Veronese, G.S., Correia, M., Bessani, A., Lung, L.C., Veríssimo, P.: Efficient Byzantine fault tolerance. IEEE Transactions on Computers 62(1), 16–30 (2011)

    Article  Google Scholar 

  34. Vukolić, M.: Quorum Systems: With Applications to Storage and Consensus. Synthesis Lectures on Distributed Computing Theory. Morgan & Claypool Publishers (2012)

    Google Scholar 

  35. Wang, Y., Alvisi, L., Dahlin, M.: Gnothi: Separating data and metadata for efficient and available storage replication. In: Proc. USENIX Annual Technical Conference, pp. 413–424 (2012)

    Google Scholar 

  36. Wilkes, J., Hoover, C., Keer, B., Mehra, P., Veitch, A.: Storage, Data, and Information Systems. HP Laboratories (2008)

    Google Scholar 

  37. Yin, J., Martin, J.-P., Venkataramani, A., Alvisi, L., Dahlin, M.: Separating agreement from execution for Byzantine fault-tolerant services. In: Proc. 19th ACM Symposium on Operating Systems Principles (SOSP), pp. 253–268 (2003)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Cachin, C., Dobre, D., Vukolić, M. (2014). Separating Data and Control: Asynchronous BFT Storage with 2t + 1 Data Replicas. In: Felber, P., Garg, V. (eds) Stabilization, Safety, and Security of Distributed Systems. SSS 2014. Lecture Notes in Computer Science, vol 8756. Springer, Cham. https://doi.org/10.1007/978-3-319-11764-5_1

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-11764-5_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11763-8

  • Online ISBN: 978-3-319-11764-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics