The VLDB Journal, Volume 22, Issue 1, pp 29–45

RemusDB: transparent high availability for database systems

  • Umar Farooq Minhas
  • Shriram Rajagopalan
  • Brendan Cully
  • Ashraf Aboulnaga
  • Kenneth Salem
  • Andrew Warfield
Special Issue Paper

Abstract

In this paper, we present a technique for building a high-availability (HA) database management system (DBMS). The proposed technique can be applied to any DBMS with little or no customization, and with reasonable performance overhead. Our approach is based on Remus, a commodity HA solution implemented in the virtualization layer, that uses asynchronous virtual machine state replication to provide transparent HA and failover capabilities. We show that while Remus and similar systems can protect a DBMS, database workloads incur a performance overhead of up to 32% as compared to an unprotected DBMS. We identify the sources of this overhead and develop optimizations that mitigate the problems. We present an experimental evaluation using two popular database systems and industry standard benchmarks showing that for certain workloads, our optimized approach provides fast failover (≤ 3 s of downtime) with low performance overhead when compared to an unprotected DBMS. Our approach provides a practical means for existing, deployed database systems to be made more reliable with a minimum of risk, cost, and effort. Furthermore, this paper invites new discussion about whether the complexity of HA is best implemented within the DBMS, or as a service by the infrastructure below it.

Keywords

High availability, Fault tolerance, Virtualization, Checkpointing, Performance modeling

Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  • Umar Farooq Minhas (1)
  • Shriram Rajagopalan (2)
  • Brendan Cully (2)
  • Ashraf Aboulnaga (1)
  • Kenneth Salem (1)
  • Andrew Warfield (2)

  1. University of Waterloo, Waterloo, Canada
  2. University of British Columbia, Vancouver, Canada