In a distributed database system (DDBS), a failure during transaction processing (such as the failure of a site where a subtransaction is being processed) may leave the database in an inconsistent state. A recovery subsystem is therefore an essential component of a DDBS. Its mechanisms must guarantee transaction atomicity and durability even when failures occur.
Distributed recovery is more complicated than centralized database recovery because failures can also occur in the communication links or at a remote site. Ideally, a recovery system should be simple, incur tolerable overhead, maintain system consistency, provide partial operability, and avoid global rollback.
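To make the atomicity requirement concrete, the following is a minimal sketch of the failure scenario described above: a transaction is split into subtransactions at two sites, one site fails, and the work already done at the other site is undone from before-images. The names `Site`, `SiteFailure`, and `run_transfer` are hypothetical and chosen only for illustration. The sketch keeps its undo log in memory and simulates the failure with a flag; an actual DDBS would write the log to stable storage and coordinate the sites with an atomic commit protocol.

```python
class SiteFailure(Exception):
    """Raised when a (simulated) site fails while processing a subtransaction."""


class Site:
    """Hypothetical site holding part of the database plus an in-memory undo log."""

    def __init__(self, name, data, fail=False):
        self.name = name
        self.data = dict(data)
        self.fail = fail          # simulate a site failure when True
        self.undo_log = []        # (key, before-image) pairs, newest last

    def apply(self, key, delta):
        if self.fail:
            raise SiteFailure(self.name)
        # Record the before-image first, then update the value.
        self.undo_log.append((key, self.data[key]))
        self.data[key] += delta

    def rollback(self):
        # Restore before-images in reverse order of the updates.
        while self.undo_log:
            key, before = self.undo_log.pop()
            self.data[key] = before

    def commit(self):
        # The updates become permanent; the undo information is discarded.
        self.undo_log.clear()


def run_transfer(src, dst, key_src, key_dst, amount):
    """Run one subtransaction per site; undo everything if any site fails."""
    touched = []
    try:
        for site, key, delta in ((src, key_src, -amount), (dst, key_dst, amount)):
            touched.append(site)
            site.apply(key, delta)
    except SiteFailure:
        for site in touched:
            site.rollback()
        return False
    for site in touched:
        site.commit()
    return True
```

If the destination site fails after the source site has already been debited, `run_transfer` returns `False` and the source value is restored, so the transaction takes effect at every site or at none. Durability is deliberately outside the scope of this sketch, since nothing here survives a real crash.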
A DDBS must be reliable for it to be useful. In particular, a reliable DDBS must guarantee transaction atomicity and...