The VLDB Journal

, Volume 27, Issue 3, pp 347–368 | Cite as

Scaling distributed transaction processing and recovery based on dependency logging

  • Chang Yao
  • Meihui Zhang
  • Qian Lin
  • Beng Chin Ooi
  • Jiatao Xu
Regular Paper


Dependency graph-based concurrency control (DGCC) protocol has been shown to achieve good performance on multi-core in-memory system. DGCCseparates contention resolution from the transaction execution and employs dependency graphs to derive serializable transaction schedules. However, distributed transactions complicate the dependency resolution, and therefore, an effective transaction partitioning strategy is essential to reduce expensive multi-node distributed transactions. During failure recovery, log must be examined from the last checkpoint onward and the affected transactions are re-executed based on the way they are partitioned and executed. Existing approaches treat both transaction management and recovery as two separate problems, even though recovery is dependent on the sequence in which transactions are executed. In this paper, we propose to treat the transaction management and recovery problems as one. We first propose an efficient distributed dependency graph-based concurrency control (DistDGCC) protocol for handling transactions spanning multiple nodes and propose a new novel and efficient logging protocol called dependency logging that also makes use of dependency graphs for efficient logging and recovery. DistDGCCoptimizes the average cost for each distributed transaction by processing transactions in batches. Moreover, it also reduces the effects of thread blocking caused by distributed transactions and consequently improves the runtime performance. Further, dependency logging exploits the same data structure that is used by DistDGCCto reduce the logging overhead, as well as the logical dependency information to improve the recovery parallelism. Extensive experiments are conducted to evaluate the performance of our proposed technique against state-of-the-art techniques. Experimental results show that DistDGCCis efficient and scalable, and dependency logging supports fast recovery with marginal runtime overhead. Hence, the overall system performance is significantly improved as a result.


Transaction management Concurrency control Recovery Logging Distributed system 



The work in this paper is supported by the National Research Foundation, Prime Ministers Office, Singapore, under its Competitive Research Programme (CRP Award No. NRF-CRP8-2011-08). The work is also in part supported by the Tencent-NUS Collaborative Research Grant. We would like to thank the anonymous reviewers for their insightful feedback that helped improve the paper.


  1. 1.
    Agrawal, R., Jagadish, H.: Recovery algorithms for database machines with non-volatile main memory. In: Database Machines, pp. 269–285 (1989)Google Scholar
  2. 2.
    Agrawal, R., Carey, M.J., Livny, M.: Concurrency control performance modeling: alternatives and implications. TODS 12(4), 609–654 (1987)CrossRefGoogle Scholar
  3. 3.
    Arulraj, J., Perron, M., Pavlo, A.: Write-behind logging. PVLDB 10(4), 337–348 (2016)Google Scholar
  4. 4.
    Bernstein, P.A., Hadzilacos, V., Goodman, N.: Concurrency Control and Recovery in Database Systems. Addison-Wesley, Reading (1987)Google Scholar
  5. 5.
    Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.: Benchmarking cloud serving systems with YCSB. In: SoCC, pp. 143–154 (2010)Google Scholar
  6. 6.
    DeWitt, D.J., Katz, R.H., Olken, F., Shapiro, L.D., Stonebraker, M.R., Wood, D.A.: Implementation techniques for main memory database systems. SIGMOD 14(2), 1–8 (1984)CrossRefGoogle Scholar
  7. 7.
    Diaconu, C., Freedman, C., Ismert, E., Larson, P.A., Mittal, P., Stonecipher, R., Verma, N., Zwilling, M.: Hekaton: SQL server’s memory-optimized OLTP engine. In: SIGMOD, pp. 1243–1254 (2013)Google Scholar
  8. 8.
    Eswaran, K.P., Gray, J.N., Lorie, R.A., Traiger, I.L.: The notions of consistency and predicate locks in a database system. Commun. ACM 19(11), 624–633 (1976)MathSciNetCrossRefzbMATHGoogle Scholar
  9. 9.
    Faleiro, J.M., Abadi, D.J.: Rethinking serializable multiversion concurrency control. PVLDB 8(11), 1190–1201 (2015)Google Scholar
  10. 10.
    Gray, J.: The transaction concept: virtues and limitations. VLDB 81, 144–154 (1981)Google Scholar
  11. 11.
    Hagmann, R.B.: Reimplementing the cedar file system using logging and group commit. In: SOSP, pp. 155–162 (1987)Google Scholar
  12. 12.
    Harizopoulos, S., Abadi, D.J., Madden, S., Stonebraker, M.: OLTP through the looking glass, and what we found there. In: SIGMOD, pp. 981–992 (2008)Google Scholar
  13. 13.
    Jagadish, H., Silberschatz, A., Sudarshan, S.: Recovering from main-memory lapses. VLDB 93, 391–404 (1993)Google Scholar
  14. 14.
    Jagadish, H.V., Lieuwen, D., Rastogi, R., Silberschatz, A., Sudarshan, S.: Dali: a high performance main memory storage manager. In: VLDB, pp. 48–59 (1994)Google Scholar
  15. 15.
    Johnson, R., Pandis, I., Stoica, R., Athanassoulis, M., Ailamaki, A.: Aether: a scalable approach to logging. PVLDB 3(1–2), 681–692 (2010)Google Scholar
  16. 16.
    Kallman, R., Kimura, H., Natkins, J., Pavlo, A., Rasin, A., Zdonik, S., Jones, E.P., Madden, S., Stonebraker, M., Zhang, Y., et al.: H-store: a high-performance, distributed main memory transaction processing system. VLDB 1(2), 1496–1499 (2008)Google Scholar
  17. 17.
    Kemper, A., Neumann, T.: Hyper: a hybrid OLTP and OLAP main memory database system based on virtual memory snapshots. In: ICDE, pp. 195–206 (2011)Google Scholar
  18. 18.
    Kernighan, B.W., Lin, S.: An efficient heuristic procedure for partitioning graphs. Bell Syst. Tech. J. 49(2), 291–307 (1970)CrossRefzbMATHGoogle Scholar
  19. 19.
    Kim, K., Wang, T., Johnson, R., Pandis, I.: Ermia: fast memory-optimized database system for heterogeneous workloads. In: SIGMOD, pp. 1675–1687 (2016)Google Scholar
  20. 20.
    Kimura, H.: Foedus: OLTP engine for a thousand cores and NVRAM. In: SIGMOD, pp. 691–706 (2015)Google Scholar
  21. 21.
    Kimura, H., Graefe, G., Kuno, H.A.: Efficient locking techniques for databases on modern hardware. In: International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures, pp. 1–12 (2012)Google Scholar
  22. 22.
    Kung, H.T., Robinson, J.T.: On optimistic methods for concurrency control. TODS 6(2), 213–226 (1981)CrossRefGoogle Scholar
  23. 23.
    Larson, P.Å., Blanas, S., Diaconu, C., Freedman, C., Patel, J.M., Zwilling, M.: High-performance concurrency control mechanisms for main-memory databases. PVLDB 5(4), 298–309 (2011)Google Scholar
  24. 24.
    Lehman, T.J., Carey, M.J.: A concurrency control algorithm for memory-resident database systems. In: Foundations of Data Organization and Algorithms, pp. 489–504 (1989)Google Scholar
  25. 25.
    Li, X., Eich, M.H.: Post-crash log processing for fuzzy checkpointing main memory databases. In: ICDE, pp. 117–124 (1993)Google Scholar
  26. 26.
    Lomet, D., Tzoumas, K., Zwilling, M.: Implementing performance competitive logical recovery. PVLDB 4(7), 430–439 (2011)Google Scholar
  27. 27.
    Louie, R.H.Y., Li, Y., Vucetic, B.: Practical physical layer network coding for two-way relay channels: performance analysis and comparison. IEEE Trans. Wirel. Commun. 9(2), 764–777 (2010)CrossRefGoogle Scholar
  28. 28.
    Malviya, N., Weisberg, A., Madden, S., Stonebraker, M.: Rethinking main memory OLTP recovery. In: ICDE, pp. 604–615 (2014)Google Scholar
  29. 29.
    Mohan, C., Haderle, D., Lindsay, B., Pirahesh, H., Schwarz, P.: Aries: a transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging. TODS 17(1), 94–162 (1992a)CrossRefGoogle Scholar
  30. 30.
    Mohan, C., Pirahesh, H., Lorie, R.: Efficient and flexible methods for transient versioning of records to avoid locking by read-only transactions. SIGMOD 21(2), 124–133 (1992b)CrossRefGoogle Scholar
  31. 31.
    Mu, S., Cui, Y., Zhang, Y., Lloyd, W., Li, J.: Extracting more concurrency from distributed transactions. In: OSDI, pp. 479–494 (2014)Google Scholar
  32. 32.
    Neumann, T., Mühlbauer, T., Kemper, A.: Fast serializable multi-version concurrency control for main-memory database systems. In: SIGMOD, pp. 677–689 (2015)Google Scholar
  33. 33.
    Ongaro, D., Rumble, S.M., Stutsman, R., Ousterhout, J., Rosenblum, M.: Fast crash recovery in ramcloud. In: SOSP, pp. 29–41 (2011)Google Scholar
  34. 34.
    Pelley, S., Wenisch, T.F., Gold, B.T., Bridge, B.: Storage management in the NVRAM era. PVLDB 7(2), 121–132 (2013)Google Scholar
  35. 35.
    Pinheiro, E., Weber, W.D., Barroso, L.A.: Failure trends in a large disk drive population. FAST 7, 17–23 (2007)Google Scholar
  36. 36.
    Ren, K., Thomson, A., Abadi, D.J.: Lightweight locking for main memory database systems. PVLDB 6(2), 145–156 (2012)Google Scholar
  37. 37.
    Ren, K., Faleiro, J.M., Abadi, D.J.: Design principles for scaling multi-core OLTP under high contention. In: SIGMOD, pp. 1583–1598 (2016)Google Scholar
  38. 38.
    Roos, F., Lindah, S.: Distribution system component failure rates and repair times—an overview. In: NORDAC, pp. 23–24 (2004)Google Scholar
  39. 39.
    Schroeder, B., Gibson, G.: A large-scale study of failures in high-performance computing systems. TDSC 7(4), 337–350 (2010)Google Scholar
  40. 40.
    Stonebraker, M., Abadi, D.J., Batkin, A., Chen, X., Cherniack, M., Ferreira, M., Lau, E., Lin, A., Madden, S., O’Neil, E., et al: C-store: a column-oriented dbms. In: VLDB, pp. 553–564 (2005)Google Scholar
  41. 41.
    Tan, K., Cai, Q., Ooi, B.C., Wong, W., Yao, C., Zhang, H.: In-memory databases: challenges and opportunities from software and hardware perspectives. SIGMOD Rec. 44(2), 35–40 (2015)CrossRefGoogle Scholar
  42. 42.
    Tu, S., Zheng, W., Kohler, E., Liskov, B., Madden, S.: Speedy transactions in multicore in-memory databases. In: SOSP, pp. 18–32 (2013)Google Scholar
  43. 43.
    Wang, T., Johnson, R.: Scalable logging through emerging non-volatile memory. PVLDB 7(10), 865–876 (2014)Google Scholar
  44. 44.
    Wu, Y., Chan, C.Y., Tan, K.L.: Transaction healing: scaling optimistic concurrency control on multicores. In: SIGMOD, pp. 1689–1704 (2016)Google Scholar
  45. 45.
    Yao, C., Agrawal, D., Chen, G., Lin, Q., Ooi, B.C., Wong, W.F., Zhang, M.: Exploiting single-threaded model in multi-core in-memory systems. TKDE 28(10), 2635–2650 (2016a)Google Scholar
  46. 46.
    Yao, C., Agrawal, D., Chen, G., Ooi, B.C., Wu, S.: Adaptive logging: optimizing logging and recovery costs in distributed in-memory databases. In: SIGMOD, pp. 1119–1134 (2016b)Google Scholar
  47. 47.
    Yu, X., Bezerra, G., Pavlo, A., Devadas, S., Stonebraker, M.: Staring into the abyss: an evaluation of concurrency control with one thousand cores. PVLDB 8(3), 209–220 (2014)Google Scholar
  48. 48.
    Yu, X., Pavlo, A., Sanchez, D., Devadas, S.: Tictoc: time traveling optimistic concurrency control. SIGMOD 8, 209–220 (2016)Google Scholar
  49. 49.
    Yuan, Y., Wang, K., Lee, R., Ding, X., Xing, J., Blanas, S., Zhang, X.: Bcc: reducing false aborts in optimistic concurrency control with low cost for in-memory databases. PVLDB 9(6), 504–515 (2016)Google Scholar
  50. 50.
    Zhang, H., Chen, G., Ooi, B.C., Tan, K.L., Zhang, M.: In-memory big data management and processing: a survey. TKDE 27(7), 1920–1948 (2015)Google Scholar
  51. 51.
    Zheng, W., Tu, S., Kohler, E., Liskov, B.: Fast databases with fast durability and recovery through multicore parallelism. In: OSDI, pp. 465–477 (2014)Google Scholar

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. 1.National University of SingaporeSingaporeSingapore
  2. 2.Beijing Institute of TechnologyBeijingChina
  3. 3.Tencent Inc.ShenzhenChina

Personalised recommendations