Transaction commit in a realistic timing model
An important problem in the construction of fault-tolerant distributed database systems is the design of nonblocking transaction commit protocols. This problem has been extensively studied for synchronous systems (i.e., systems in which no message ever arrives late). In this paper, the synchrony assumption is relaxed. A new partially synchronous timing model is described, and a new nonblocking randomized transaction commit protocol, which incorporates an agreement protocol of Ben-Or, is developed for it. The new protocol works as long as fewer than half the processors fail. A matching lower bound is proved, showing that the number of processor faults tolerated is optimal. If half or more of the processors fail, the protocol degrades gracefully: it blocks, but no processor produces a wrong answer. A notion of asynchronous round is defined, and the protocol is shown to terminate in a small constant expected number of asynchronous rounds. In contrast, it is shown that no protocol in this model can guarantee that a processor terminates in a bounded expected number of its own steps, even if processors are synchronous.
Key words: Distributed databases, Fault tolerance, Lower bounds, Randomized protocols, Time bounds, Transaction commit
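To make the "fewer than half the processors fail" bound concrete, the following is a minimal sketch of the local decision rules in one round of the crash-fault variant of Ben-Or's agreement protocol, as it is commonly presented. It is not the commit protocol of this paper, which is not specified in the abstract. The function names, the use of `None` for "no proposal", and the message representation are all assumptions made for illustration.

```python
import random
from collections import Counter
from typing import Optional, Sequence, Tuple


def max_tolerated_faults(n: int) -> int:
    """Largest t with t < n/2, the fault bound stated in the abstract."""
    return (n - 1) // 2


def phase1_proposal(reports: Sequence[int], n: int) -> Optional[int]:
    """First phase: propose v only if a strict majority of all n
    processors reported v; otherwise propose nothing (None)."""
    if not reports:
        return None
    value, count = Counter(reports).most_common(1)[0]
    return value if 2 * count > n else None


def phase2_update(
    proposals: Sequence[Optional[int]], t: int, rng: random.Random
) -> Tuple[int, bool]:
    """Second phase: return (new_value, decided).

    Any two non-None proposals in a round must agree, because two strict
    majorities of the same n processors intersect. More than t matching
    proposals allow a decision; with no proposal at all, flip a coin.
    """
    backed = [p for p in proposals if p is not None]
    if backed:
        return backed[0], len(backed) > t
    return rng.randrange(2), False


# Usage: n = 5 processors, so at most t = 2 crashes are tolerated and
# each phase waits for n - t = 3 messages.
n = 5
t = max_tolerated_faults(n)
rng = random.Random(0)
print(phase1_proposal([1, 1, 1], n))           # strict majority for 1
print(phase1_proposal([1, 1, 0], n))           # no strict majority
print(phase2_update([1, 1, 1], t, rng))        # enough support to decide
print(phase2_update([1, None, None], t, rng))  # adopt 1, do not decide yet
```

The random choice in the last branch is what lets such a protocol sidestep the deterministic impossibility result for asynchronous consensus. The majority test in the first phase is also why the sketch cannot make progress once half or more of the processors have crashed: the processor simply keeps waiting, and no rule above produces a conflicting decision.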
- 1. Ben-Or M: Another advantage of free choice: Completely asynchronous agreement protocols. In: Proc 2nd Annu ACM Symp Principles Distrib Comput 1983, pp 27–30
- 2. Coan BA, Lundelius J: Transaction commit in a realistic fault model. In: Proc 5th Annu ACM Symp Principles Distrib Comput 1986, pp 40–51
- 3. Chor B, Merritt M, Shmoys D: Simple constant-time consensus protocols in realistic failure models. J ACM 36:591–614 (1989)
- 4. Dolev D, Dwork C, Stockmeyer L: On the minimal synchronism needed for distributed consensus. J ACM 34:77–97 (1987)
- 5. Dwork C, Lynch NA, Stockmeyer L: Consensus in the presence of partial synchrony. J ACM 35:288–323 (1988)
- 6. Dwork C, Skeen D: The inherent cost of nonblocking commitment. In: Proc 2nd Annu ACM Symp Principles Distrib Comput 1983, pp 1–11
- 7. Dwork C, Skeen D: Patterns of communication in consensus protocols. In: Proc 3rd Annu ACM Symp Principles Distrib Comput 1984, pp 143–153
- 8. Fischer MJ, Lynch NA, Paterson MS: Impossibility of distributed consensus with one faulty process. J ACM 32:374–382 (1985)
- 9. Gray J: Notes on database operating systems. In: Bayer R, Graham RM, Seegmüller G (eds) Operating systems: an advanced course. Lect Notes Comput Sci, vol 60. Springer, Berlin Heidelberg New York 1978, pp 393–481
- 10. Halpern JY, Moses YO: Knowledge and common knowledge in a distributed environment. In: Proc 3rd Annu ACM Symp Principles Distrib Comput 1984, pp 50–61 (revised as of Jan. 1986 as IBM-RJ-4421)
- 11. Rabin MO: Randomized Byzantine generals. In: Proc 24th Annu IEEE Symp Found Comput Sci 1983, pp 403–409
- 12. Skeen D: Crash recovery in a distributed database system. Ph.D. dissertation, University of California, Berkeley 1982 (available as UCB/ERL M82/45)