Scaling distributed transaction processing and recovery based on dependency logging

Yao, Chang; Zhang, Meihui; Lin, Qian; Ooi, Beng Chin; Xu, Jiatao

doi:10.1007/s00778-018-0500-2

Scaling distributed transaction processing and recovery based on dependency logging

Regular Paper
Published: 22 March 2018

Volume 27, pages 347–368, (2018)
Cite this article

The VLDB Journal Aims and scope Submit manuscript

Chang Yao¹,
Meihui Zhang ORCID: orcid.org/0000-0002-0752-9877²,
Qian Lin¹,
Beng Chin Ooi¹ &
…
Jiatao Xu³

613 Accesses
2 Citations
3 Altmetric
Explore all metrics

Abstract

Dependency graph-based concurrency control (DGCC) protocol has been shown to achieve good performance on multi-core in-memory system. DGCCseparates contention resolution from the transaction execution and employs dependency graphs to derive serializable transaction schedules. However, distributed transactions complicate the dependency resolution, and therefore, an effective transaction partitioning strategy is essential to reduce expensive multi-node distributed transactions. During failure recovery, log must be examined from the last checkpoint onward and the affected transactions are re-executed based on the way they are partitioned and executed. Existing approaches treat both transaction management and recovery as two separate problems, even though recovery is dependent on the sequence in which transactions are executed. In this paper, we propose to treat the transaction management and recovery problems as one. We first propose an efficient distributed dependency graph-based concurrency control (DistDGCC) protocol for handling transactions spanning multiple nodes and propose a new novel and efficient logging protocol called dependency logging that also makes use of dependency graphs for efficient logging and recovery. DistDGCCoptimizes the average cost for each distributed transaction by processing transactions in batches. Moreover, it also reduces the effects of thread blocking caused by distributed transactions and consequently improves the runtime performance. Further, dependency logging exploits the same data structure that is used by DistDGCCto reduce the logging overhead, as well as the logical dependency information to improve the recovery parallelism. Extensive experiments are conducted to evaluate the performance of our proposed technique against state-of-the-art techniques. Experimental results show that DistDGCCis efficient and scalable, and dependency logging supports fast recovery with marginal runtime overhead. Hence, the overall system performance is significantly improved as a result.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

A Twin-Buffer Scheme for High-Throughput Logging

Plover: Parallel In-Memory Database Logging on Scalable Storage Devices

A Distributed Run-Time Dynamic Data Manager for Multi-core System Parallel Execution

Notes

References

Agrawal, R., Jagadish, H.: Recovery algorithms for database machines with non-volatile main memory. In: Database Machines, pp. 269–285 (1989)
Agrawal, R., Carey, M.J., Livny, M.: Concurrency control performance modeling: alternatives and implications. TODS 12(4), 609–654 (1987)
Article Google Scholar
Arulraj, J., Perron, M., Pavlo, A.: Write-behind logging. PVLDB 10(4), 337–348 (2016)
Google Scholar
Bernstein, P.A., Hadzilacos, V., Goodman, N.: Concurrency Control and Recovery in Database Systems. Addison-Wesley, Reading (1987)
Google Scholar
Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.: Benchmarking cloud serving systems with YCSB. In: SoCC, pp. 143–154 (2010)
DeWitt, D.J., Katz, R.H., Olken, F., Shapiro, L.D., Stonebraker, M.R., Wood, D.A.: Implementation techniques for main memory database systems. SIGMOD 14(2), 1–8 (1984)
Article Google Scholar
Diaconu, C., Freedman, C., Ismert, E., Larson, P.A., Mittal, P., Stonecipher, R., Verma, N., Zwilling, M.: Hekaton: SQL server’s memory-optimized OLTP engine. In: SIGMOD, pp. 1243–1254 (2013)
Eswaran, K.P., Gray, J.N., Lorie, R.A., Traiger, I.L.: The notions of consistency and predicate locks in a database system. Commun. ACM 19(11), 624–633 (1976)
Article MathSciNet MATH Google Scholar
Faleiro, J.M., Abadi, D.J.: Rethinking serializable multiversion concurrency control. PVLDB 8(11), 1190–1201 (2015)
Google Scholar
Gray, J.: The transaction concept: virtues and limitations. VLDB 81, 144–154 (1981)
Google Scholar
Hagmann, R.B.: Reimplementing the cedar file system using logging and group commit. In: SOSP, pp. 155–162 (1987)
Harizopoulos, S., Abadi, D.J., Madden, S., Stonebraker, M.: OLTP through the looking glass, and what we found there. In: SIGMOD, pp. 981–992 (2008)
Jagadish, H., Silberschatz, A., Sudarshan, S.: Recovering from main-memory lapses. VLDB 93, 391–404 (1993)
Google Scholar
Jagadish, H.V., Lieuwen, D., Rastogi, R., Silberschatz, A., Sudarshan, S.: Dali: a high performance main memory storage manager. In: VLDB, pp. 48–59 (1994)
Johnson, R., Pandis, I., Stoica, R., Athanassoulis, M., Ailamaki, A.: Aether: a scalable approach to logging. PVLDB 3(1–2), 681–692 (2010)
Google Scholar
Kallman, R., Kimura, H., Natkins, J., Pavlo, A., Rasin, A., Zdonik, S., Jones, E.P., Madden, S., Stonebraker, M., Zhang, Y., et al.: H-store: a high-performance, distributed main memory transaction processing system. VLDB 1(2), 1496–1499 (2008)
Google Scholar
Kemper, A., Neumann, T.: Hyper: a hybrid OLTP and OLAP main memory database system based on virtual memory snapshots. In: ICDE, pp. 195–206 (2011)
Kernighan, B.W., Lin, S.: An efficient heuristic procedure for partitioning graphs. Bell Syst. Tech. J. 49(2), 291–307 (1970)
Article MATH Google Scholar
Kim, K., Wang, T., Johnson, R., Pandis, I.: Ermia: fast memory-optimized database system for heterogeneous workloads. In: SIGMOD, pp. 1675–1687 (2016)
Kimura, H.: Foedus: OLTP engine for a thousand cores and NVRAM. In: SIGMOD, pp. 691–706 (2015)
Kimura, H., Graefe, G., Kuno, H.A.: Efficient locking techniques for databases on modern hardware. In: International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures, pp. 1–12 (2012)
Kung, H.T., Robinson, J.T.: On optimistic methods for concurrency control. TODS 6(2), 213–226 (1981)
Article Google Scholar
Larson, P.Å., Blanas, S., Diaconu, C., Freedman, C., Patel, J.M., Zwilling, M.: High-performance concurrency control mechanisms for main-memory databases. PVLDB 5(4), 298–309 (2011)
Google Scholar
Lehman, T.J., Carey, M.J.: A concurrency control algorithm for memory-resident database systems. In: Foundations of Data Organization and Algorithms, pp. 489–504 (1989)
Li, X., Eich, M.H.: Post-crash log processing for fuzzy checkpointing main memory databases. In: ICDE, pp. 117–124 (1993)
Lomet, D., Tzoumas, K., Zwilling, M.: Implementing performance competitive logical recovery. PVLDB 4(7), 430–439 (2011)
Google Scholar
Louie, R.H.Y., Li, Y., Vucetic, B.: Practical physical layer network coding for two-way relay channels: performance analysis and comparison. IEEE Trans. Wirel. Commun. 9(2), 764–777 (2010)
Article Google Scholar
Malviya, N., Weisberg, A., Madden, S., Stonebraker, M.: Rethinking main memory OLTP recovery. In: ICDE, pp. 604–615 (2014)
Mohan, C., Haderle, D., Lindsay, B., Pirahesh, H., Schwarz, P.: Aries: a transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging. TODS 17(1), 94–162 (1992a)
Article Google Scholar
Mohan, C., Pirahesh, H., Lorie, R.: Efficient and flexible methods for transient versioning of records to avoid locking by read-only transactions. SIGMOD 21(2), 124–133 (1992b)
Article Google Scholar
Mu, S., Cui, Y., Zhang, Y., Lloyd, W., Li, J.: Extracting more concurrency from distributed transactions. In: OSDI, pp. 479–494 (2014)
Neumann, T., Mühlbauer, T., Kemper, A.: Fast serializable multi-version concurrency control for main-memory database systems. In: SIGMOD, pp. 677–689 (2015)
Ongaro, D., Rumble, S.M., Stutsman, R., Ousterhout, J., Rosenblum, M.: Fast crash recovery in ramcloud. In: SOSP, pp. 29–41 (2011)
Pelley, S., Wenisch, T.F., Gold, B.T., Bridge, B.: Storage management in the NVRAM era. PVLDB 7(2), 121–132 (2013)
Google Scholar
Pinheiro, E., Weber, W.D., Barroso, L.A.: Failure trends in a large disk drive population. FAST 7, 17–23 (2007)
Google Scholar
Ren, K., Thomson, A., Abadi, D.J.: Lightweight locking for main memory database systems. PVLDB 6(2), 145–156 (2012)
Google Scholar
Ren, K., Faleiro, J.M., Abadi, D.J.: Design principles for scaling multi-core OLTP under high contention. In: SIGMOD, pp. 1583–1598 (2016)
Roos, F., Lindah, S.: Distribution system component failure rates and repair times—an overview. In: NORDAC, pp. 23–24 (2004)
Schroeder, B., Gibson, G.: A large-scale study of failures in high-performance computing systems. TDSC 7(4), 337–350 (2010)
Google Scholar
Stonebraker, M., Abadi, D.J., Batkin, A., Chen, X., Cherniack, M., Ferreira, M., Lau, E., Lin, A., Madden, S., O’Neil, E., et al: C-store: a column-oriented dbms. In: VLDB, pp. 553–564 (2005)
Tan, K., Cai, Q., Ooi, B.C., Wong, W., Yao, C., Zhang, H.: In-memory databases: challenges and opportunities from software and hardware perspectives. SIGMOD Rec. 44(2), 35–40 (2015)
Article Google Scholar
Tu, S., Zheng, W., Kohler, E., Liskov, B., Madden, S.: Speedy transactions in multicore in-memory databases. In: SOSP, pp. 18–32 (2013)
Wang, T., Johnson, R.: Scalable logging through emerging non-volatile memory. PVLDB 7(10), 865–876 (2014)
Google Scholar
Wu, Y., Chan, C.Y., Tan, K.L.: Transaction healing: scaling optimistic concurrency control on multicores. In: SIGMOD, pp. 1689–1704 (2016)
Yao, C., Agrawal, D., Chen, G., Lin, Q., Ooi, B.C., Wong, W.F., Zhang, M.: Exploiting single-threaded model in multi-core in-memory systems. TKDE 28(10), 2635–2650 (2016a)
Google Scholar
Yao, C., Agrawal, D., Chen, G., Ooi, B.C., Wu, S.: Adaptive logging: optimizing logging and recovery costs in distributed in-memory databases. In: SIGMOD, pp. 1119–1134 (2016b)
Yu, X., Bezerra, G., Pavlo, A., Devadas, S., Stonebraker, M.: Staring into the abyss: an evaluation of concurrency control with one thousand cores. PVLDB 8(3), 209–220 (2014)
Google Scholar
Yu, X., Pavlo, A., Sanchez, D., Devadas, S.: Tictoc: time traveling optimistic concurrency control. SIGMOD 8, 209–220 (2016)
Google Scholar
Yuan, Y., Wang, K., Lee, R., Ding, X., Xing, J., Blanas, S., Zhang, X.: Bcc: reducing false aborts in optimistic concurrency control with low cost for in-memory databases. PVLDB 9(6), 504–515 (2016)
Google Scholar
Zhang, H., Chen, G., Ooi, B.C., Tan, K.L., Zhang, M.: In-memory big data management and processing: a survey. TKDE 27(7), 1920–1948 (2015)
Google Scholar
Zheng, W., Tu, S., Kohler, E., Liskov, B.: Fast databases with fast durability and recovery through multicore parallelism. In: OSDI, pp. 465–477 (2014)

Download references

Acknowledgements

The work in this paper is supported by the National Research Foundation, Prime Ministers Office, Singapore, under its Competitive Research Programme (CRP Award No. NRF-CRP8-2011-08). The work is also in part supported by the Tencent-NUS Collaborative Research Grant. We would like to thank the anonymous reviewers for their insightful feedback that helped improve the paper.

Author information

Authors and Affiliations

National University of Singapore, Singapore, Singapore
Chang Yao, Qian Lin & Beng Chin Ooi
Beijing Institute of Technology, Beijing, China
Meihui Zhang
Tencent Inc., Shenzhen, China
Jiatao Xu

Authors

Chang Yao
View author publications
You can also search for this author in PubMed Google Scholar
Meihui Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Qian Lin
View author publications
You can also search for this author in PubMed Google Scholar
Beng Chin Ooi
View author publications
You can also search for this author in PubMed Google Scholar
Jiatao Xu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Meihui Zhang.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Yao, C., Zhang, M., Lin, Q. et al. Scaling distributed transaction processing and recovery based on dependency logging. The VLDB Journal 27, 347–368 (2018). https://doi.org/10.1007/s00778-018-0500-2

Download citation

Received: 29 June 2017
Revised: 26 February 2018
Accepted: 10 March 2018
Published: 22 March 2018
Issue Date: June 2018
DOI: https://doi.org/10.1007/s00778-018-0500-2

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Scaling distributed transaction processing and recovery based on dependency logging

Abstract

Access this article

Similar content being viewed by others

A Twin-Buffer Scheme for High-Throughput Logging

Plover: Parallel In-Memory Database Logging on Scalable Storage Devices

A Distributed Run-Time Dynamic Data Manager for Multi-core System Parallel Execution

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Abstract

Access this article

Similar content being viewed by others

A Twin-Buffer Scheme for High-Throughput Logging

Plover: Parallel In-Memory Database Logging on Scalable Storage Devices

A Distributed Run-Time Dynamic Data Manager for Multi-core System Parallel Execution

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation