Skip to main content
Log in

Scaling distributed transaction processing and recovery based on dependency logging

  • Regular Paper
  • Published:
The VLDB Journal Aims and scope Submit manuscript

Abstract

Dependency graph-based concurrency control (DGCC) protocol has been shown to achieve good performance on multi-core in-memory system. DGCCseparates contention resolution from the transaction execution and employs dependency graphs to derive serializable transaction schedules. However, distributed transactions complicate the dependency resolution, and therefore, an effective transaction partitioning strategy is essential to reduce expensive multi-node distributed transactions. During failure recovery, log must be examined from the last checkpoint onward and the affected transactions are re-executed based on the way they are partitioned and executed. Existing approaches treat both transaction management and recovery as two separate problems, even though recovery is dependent on the sequence in which transactions are executed. In this paper, we propose to treat the transaction management and recovery problems as one. We first propose an efficient distributed dependency graph-based concurrency control (DistDGCC) protocol for handling transactions spanning multiple nodes and propose a new novel and efficient logging protocol called dependency logging that also makes use of dependency graphs for efficient logging and recovery. DistDGCCoptimizes the average cost for each distributed transaction by processing transactions in batches. Moreover, it also reduces the effects of thread blocking caused by distributed transactions and consequently improves the runtime performance. Further, dependency logging exploits the same data structure that is used by DistDGCCto reduce the logging overhead, as well as the logical dependency information to improve the recovery parallelism. Extensive experiments are conducted to evaluate the performance of our proposed technique against state-of-the-art techniques. Experimental results show that DistDGCCis efficient and scalable, and dependency logging supports fast recovery with marginal runtime overhead. Hence, the overall system performance is significantly improved as a result.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18
Fig. 19
Fig. 20
Fig. 21
Fig. 22

Similar content being viewed by others

Notes

  1. http://www.anandtech.com/show/10512/price-check-q3-2016-dram-prices-down-over-20-since-early-2015.

  2. http://www.tpc.org/tpcc/.

References

  1. Agrawal, R., Jagadish, H.: Recovery algorithms for database machines with non-volatile main memory. In: Database Machines, pp. 269–285 (1989)

  2. Agrawal, R., Carey, M.J., Livny, M.: Concurrency control performance modeling: alternatives and implications. TODS 12(4), 609–654 (1987)

    Article  Google Scholar 

  3. Arulraj, J., Perron, M., Pavlo, A.: Write-behind logging. PVLDB 10(4), 337–348 (2016)

    Google Scholar 

  4. Bernstein, P.A., Hadzilacos, V., Goodman, N.: Concurrency Control and Recovery in Database Systems. Addison-Wesley, Reading (1987)

    Google Scholar 

  5. Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.: Benchmarking cloud serving systems with YCSB. In: SoCC, pp. 143–154 (2010)

  6. DeWitt, D.J., Katz, R.H., Olken, F., Shapiro, L.D., Stonebraker, M.R., Wood, D.A.: Implementation techniques for main memory database systems. SIGMOD 14(2), 1–8 (1984)

    Article  Google Scholar 

  7. Diaconu, C., Freedman, C., Ismert, E., Larson, P.A., Mittal, P., Stonecipher, R., Verma, N., Zwilling, M.: Hekaton: SQL server’s memory-optimized OLTP engine. In: SIGMOD, pp. 1243–1254 (2013)

  8. Eswaran, K.P., Gray, J.N., Lorie, R.A., Traiger, I.L.: The notions of consistency and predicate locks in a database system. Commun. ACM 19(11), 624–633 (1976)

    Article  MathSciNet  MATH  Google Scholar 

  9. Faleiro, J.M., Abadi, D.J.: Rethinking serializable multiversion concurrency control. PVLDB 8(11), 1190–1201 (2015)

    Google Scholar 

  10. Gray, J.: The transaction concept: virtues and limitations. VLDB 81, 144–154 (1981)

    Google Scholar 

  11. Hagmann, R.B.: Reimplementing the cedar file system using logging and group commit. In: SOSP, pp. 155–162 (1987)

  12. Harizopoulos, S., Abadi, D.J., Madden, S., Stonebraker, M.: OLTP through the looking glass, and what we found there. In: SIGMOD, pp. 981–992 (2008)

  13. Jagadish, H., Silberschatz, A., Sudarshan, S.: Recovering from main-memory lapses. VLDB 93, 391–404 (1993)

    Google Scholar 

  14. Jagadish, H.V., Lieuwen, D., Rastogi, R., Silberschatz, A., Sudarshan, S.: Dali: a high performance main memory storage manager. In: VLDB, pp. 48–59 (1994)

  15. Johnson, R., Pandis, I., Stoica, R., Athanassoulis, M., Ailamaki, A.: Aether: a scalable approach to logging. PVLDB 3(1–2), 681–692 (2010)

    Google Scholar 

  16. Kallman, R., Kimura, H., Natkins, J., Pavlo, A., Rasin, A., Zdonik, S., Jones, E.P., Madden, S., Stonebraker, M., Zhang, Y., et al.: H-store: a high-performance, distributed main memory transaction processing system. VLDB 1(2), 1496–1499 (2008)

    Google Scholar 

  17. Kemper, A., Neumann, T.: Hyper: a hybrid OLTP and OLAP main memory database system based on virtual memory snapshots. In: ICDE, pp. 195–206 (2011)

  18. Kernighan, B.W., Lin, S.: An efficient heuristic procedure for partitioning graphs. Bell Syst. Tech. J. 49(2), 291–307 (1970)

    Article  MATH  Google Scholar 

  19. Kim, K., Wang, T., Johnson, R., Pandis, I.: Ermia: fast memory-optimized database system for heterogeneous workloads. In: SIGMOD, pp. 1675–1687 (2016)

  20. Kimura, H.: Foedus: OLTP engine for a thousand cores and NVRAM. In: SIGMOD, pp. 691–706 (2015)

  21. Kimura, H., Graefe, G., Kuno, H.A.: Efficient locking techniques for databases on modern hardware. In: International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures, pp. 1–12 (2012)

  22. Kung, H.T., Robinson, J.T.: On optimistic methods for concurrency control. TODS 6(2), 213–226 (1981)

    Article  Google Scholar 

  23. Larson, P.Å., Blanas, S., Diaconu, C., Freedman, C., Patel, J.M., Zwilling, M.: High-performance concurrency control mechanisms for main-memory databases. PVLDB 5(4), 298–309 (2011)

    Google Scholar 

  24. Lehman, T.J., Carey, M.J.: A concurrency control algorithm for memory-resident database systems. In: Foundations of Data Organization and Algorithms, pp. 489–504 (1989)

  25. Li, X., Eich, M.H.: Post-crash log processing for fuzzy checkpointing main memory databases. In: ICDE, pp. 117–124 (1993)

  26. Lomet, D., Tzoumas, K., Zwilling, M.: Implementing performance competitive logical recovery. PVLDB 4(7), 430–439 (2011)

    Google Scholar 

  27. Louie, R.H.Y., Li, Y., Vucetic, B.: Practical physical layer network coding for two-way relay channels: performance analysis and comparison. IEEE Trans. Wirel. Commun. 9(2), 764–777 (2010)

    Article  Google Scholar 

  28. Malviya, N., Weisberg, A., Madden, S., Stonebraker, M.: Rethinking main memory OLTP recovery. In: ICDE, pp. 604–615 (2014)

  29. Mohan, C., Haderle, D., Lindsay, B., Pirahesh, H., Schwarz, P.: Aries: a transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging. TODS 17(1), 94–162 (1992a)

    Article  Google Scholar 

  30. Mohan, C., Pirahesh, H., Lorie, R.: Efficient and flexible methods for transient versioning of records to avoid locking by read-only transactions. SIGMOD 21(2), 124–133 (1992b)

    Article  Google Scholar 

  31. Mu, S., Cui, Y., Zhang, Y., Lloyd, W., Li, J.: Extracting more concurrency from distributed transactions. In: OSDI, pp. 479–494 (2014)

  32. Neumann, T., Mühlbauer, T., Kemper, A.: Fast serializable multi-version concurrency control for main-memory database systems. In: SIGMOD, pp. 677–689 (2015)

  33. Ongaro, D., Rumble, S.M., Stutsman, R., Ousterhout, J., Rosenblum, M.: Fast crash recovery in ramcloud. In: SOSP, pp. 29–41 (2011)

  34. Pelley, S., Wenisch, T.F., Gold, B.T., Bridge, B.: Storage management in the NVRAM era. PVLDB 7(2), 121–132 (2013)

    Google Scholar 

  35. Pinheiro, E., Weber, W.D., Barroso, L.A.: Failure trends in a large disk drive population. FAST 7, 17–23 (2007)

    Google Scholar 

  36. Ren, K., Thomson, A., Abadi, D.J.: Lightweight locking for main memory database systems. PVLDB 6(2), 145–156 (2012)

    Google Scholar 

  37. Ren, K., Faleiro, J.M., Abadi, D.J.: Design principles for scaling multi-core OLTP under high contention. In: SIGMOD, pp. 1583–1598 (2016)

  38. Roos, F., Lindah, S.: Distribution system component failure rates and repair times—an overview. In: NORDAC, pp. 23–24 (2004)

  39. Schroeder, B., Gibson, G.: A large-scale study of failures in high-performance computing systems. TDSC 7(4), 337–350 (2010)

    Google Scholar 

  40. Stonebraker, M., Abadi, D.J., Batkin, A., Chen, X., Cherniack, M., Ferreira, M., Lau, E., Lin, A., Madden, S., O’Neil, E., et al: C-store: a column-oriented dbms. In: VLDB, pp. 553–564 (2005)

  41. Tan, K., Cai, Q., Ooi, B.C., Wong, W., Yao, C., Zhang, H.: In-memory databases: challenges and opportunities from software and hardware perspectives. SIGMOD Rec. 44(2), 35–40 (2015)

    Article  Google Scholar 

  42. Tu, S., Zheng, W., Kohler, E., Liskov, B., Madden, S.: Speedy transactions in multicore in-memory databases. In: SOSP, pp. 18–32 (2013)

  43. Wang, T., Johnson, R.: Scalable logging through emerging non-volatile memory. PVLDB 7(10), 865–876 (2014)

    Google Scholar 

  44. Wu, Y., Chan, C.Y., Tan, K.L.: Transaction healing: scaling optimistic concurrency control on multicores. In: SIGMOD, pp. 1689–1704 (2016)

  45. Yao, C., Agrawal, D., Chen, G., Lin, Q., Ooi, B.C., Wong, W.F., Zhang, M.: Exploiting single-threaded model in multi-core in-memory systems. TKDE 28(10), 2635–2650 (2016a)

    Google Scholar 

  46. Yao, C., Agrawal, D., Chen, G., Ooi, B.C., Wu, S.: Adaptive logging: optimizing logging and recovery costs in distributed in-memory databases. In: SIGMOD, pp. 1119–1134 (2016b)

  47. Yu, X., Bezerra, G., Pavlo, A., Devadas, S., Stonebraker, M.: Staring into the abyss: an evaluation of concurrency control with one thousand cores. PVLDB 8(3), 209–220 (2014)

    Google Scholar 

  48. Yu, X., Pavlo, A., Sanchez, D., Devadas, S.: Tictoc: time traveling optimistic concurrency control. SIGMOD 8, 209–220 (2016)

    Google Scholar 

  49. Yuan, Y., Wang, K., Lee, R., Ding, X., Xing, J., Blanas, S., Zhang, X.: Bcc: reducing false aborts in optimistic concurrency control with low cost for in-memory databases. PVLDB 9(6), 504–515 (2016)

    Google Scholar 

  50. Zhang, H., Chen, G., Ooi, B.C., Tan, K.L., Zhang, M.: In-memory big data management and processing: a survey. TKDE 27(7), 1920–1948 (2015)

    Google Scholar 

  51. Zheng, W., Tu, S., Kohler, E., Liskov, B.: Fast databases with fast durability and recovery through multicore parallelism. In: OSDI, pp. 465–477 (2014)

Download references

Acknowledgements

The work in this paper is supported by the National Research Foundation, Prime Ministers Office, Singapore, under its Competitive Research Programme (CRP Award No. NRF-CRP8-2011-08). The work is also in part supported by the Tencent-NUS Collaborative Research Grant. We would like to thank the anonymous reviewers for their insightful feedback that helped improve the paper.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Meihui Zhang.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Yao, C., Zhang, M., Lin, Q. et al. Scaling distributed transaction processing and recovery based on dependency logging. The VLDB Journal 27, 347–368 (2018). https://doi.org/10.1007/s00778-018-0500-2

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00778-018-0500-2

Keywords

Navigation