RemusDB: transparent high availability for database systems

  • Special Issue Paper
The VLDB Journal

Abstract

In this paper, we present a technique for building a high-availability (HA) database management system (DBMS). The proposed technique can be applied to any DBMS with little or no customization, and with reasonable performance overhead. Our approach is based on Remus, a commodity HA solution implemented in the virtualization layer that uses asynchronous virtual machine state replication to provide transparent HA and failover capabilities. We show that while Remus and similar systems can protect a DBMS, database workloads incur a performance overhead of up to 32% compared to an unprotected DBMS. We identify the sources of this overhead and develop optimizations that mitigate the problems. We present an experimental evaluation using two popular database systems and industry-standard benchmarks showing that, for certain workloads, our optimized approach provides fast failover (\(\le 3\) s of downtime) with low performance overhead when compared to an unprotected DBMS. Our approach provides a practical means for existing, deployed database systems to be made more reliable with a minimum of risk, cost, and effort. Furthermore, this paper invites new discussion about whether the complexity of HA is best implemented within the DBMS, or as a service by the infrastructure below it.

Author information

Corresponding author

Correspondence to Umar Farooq Minhas.

About this article

Cite this article

Minhas, U.F., Rajagopalan, S., Cully, B. et al. RemusDB: transparent high availability for database systems. The VLDB Journal 22, 29–45 (2013). https://doi.org/10.1007/s00778-012-0294-6

  • DOI: https://doi.org/10.1007/s00778-012-0294-6
