DISC 2002: Distributed Computing, pp. 311-325
Minimal Byzantine Storage
Abstract
Byzantine fault-tolerant storage systems can provide high availability in hazardous environments, but the redundant servers they require increase software development and hardware costs. To minimize the number of servers required to implement fault-tolerant storage services, we develop a new algorithm, SBQ-L, that uses a “Listeners” pattern of network communication to detect and resolve ordering ambiguities created by concurrent accesses to the system. Our protocol requires 3f + 1 servers to tolerate up to f Byzantine faults, f fewer than the 4f + 1 required by existing protocols for non-self-verifying data. In addition, SBQ-L provides atomic consistency semantics, which are stronger than the regular or pseudo-atomic semantics provided by these existing protocols. We show that this protocol is optimal in the number of servers: any protocol that provides safe semantics or stronger requires at least 3f + 1 servers to tolerate f Byzantine faults in an asynchronous system. Finally, we examine a non-confirmable writes variation of the SBQ-L protocol in which a client cannot determine when its writes complete. We show that SBQ-L with non-confirmable writes provides regular semantics with 2f + 1 servers and that this number of servers is minimal.
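To make the server counts in the abstract concrete, the following Python sketch tabulates the three bounds for small values of f. It only restates the arithmetic quoted above; the function name `servers_required` and the dictionary keys are hypothetical labels introduced for illustration and are not part of the paper or the protocol.

```python
def servers_required(f: int) -> dict[str, int]:
    """Server counts quoted in the abstract for tolerating f Byzantine faults.

    The keys are illustrative labels (assumptions), not names from the paper.
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    return {
        "existing_protocols": 4 * f + 1,     # prior protocols, non-self-verifying data
        "sbq_l": 3 * f + 1,                  # SBQ-L with confirmable writes
        "sbq_l_non_confirmable": 2 * f + 1,  # SBQ-L variant with non-confirmable writes
    }


if __name__ == "__main__":
    for f in range(1, 4):
        counts = servers_required(f)
        saved = counts["existing_protocols"] - counts["sbq_l"]
        print(f"f={f}: {counts}, servers saved by SBQ-L: {saved}")
```

For any f, the difference between the first two entries is f, which matches the abstract's statement that SBQ-L needs f fewer servers than the existing protocols.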
Keywords
Shared Memory, Read Operation, Read Request, Quorum System, Additional Message