Multi-core Adaptive Merging of the Secondary Index for LSM-Based Stores

Macyna, Wojciech; Kukowski, Michal; Zwarzko, Michal

doi:10.1007/978-3-031-39821-6_20


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14147)

Included in the following conference series:

International Conference on Database and Expert Systems Applications


Abstract

NoSQL databases have become very popular. Most of them use the Log-Structured Merge (LSM) tree, which provides high write throughput and fast lookups by primary key. Searching by non-key attributes, however, is very slow because the entire LSM-tree must be scanned. A secondary index overcomes this problem. Typically, the secondary index covers all items in the database equally. This is inefficient in big data stores, where some items are queried very often and others never. Adaptive merging was introduced to solve this problem. The key idea is to build the secondary index adaptively, as a by-product of query processing. Consequently, the database is only partially indexed, depending on the query workload.
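The idea of indexing only what queries ask for can be illustrated with a minimal sketch. It is not the method of the paper: the class name `PartialSecondaryIndex`, the in-memory dictionary standing in for the LSM-tree, and the `full_scans` counter are all assumptions made for illustration. The first query for a value pays for a full scan, and later queries for the same value are answered from the index.

```python
from collections import defaultdict


class PartialSecondaryIndex:
    """Toy secondary index filled lazily, one queried value at a time.

    Hypothetical illustration: `store` is a plain dict standing in for
    the primary-key store, not a real LSM-tree.
    """

    def __init__(self, store, attribute):
        self.store = store                # primary key -> record (dict)
        self.attribute = attribute        # the non-key attribute being indexed
        self.index = defaultdict(set)     # attribute value -> primary keys
        self.indexed_values = set()       # values the index already covers
        self.full_scans = 0               # how often the whole store was scanned

    def find(self, value):
        """Return the sorted primary keys of records whose attribute equals value."""
        if value not in self.indexed_values:
            # Index miss: scan everything once and remember the matches.
            self.full_scans += 1
            for pk, record in self.store.items():
                if record.get(self.attribute) == value:
                    self.index[value].add(pk)
            self.indexed_values.add(value)
        return sorted(self.index.get(value, ()))


store = {
    1: {"city": "Wroclaw"},
    2: {"city": "Vienna"},
    3: {"city": "Wroclaw"},
}
idx = PartialSecondaryIndex(store, "city")
first = idx.find("Wroclaw")    # scans the store, then indexes the matches
second = idx.find("Wroclaw")   # served from the index, no new scan
```

Values that are never queried, such as "Vienna" here, never enter the index, which is the sense in which the database is indexed partially.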

This paper considers adaptive merging of the secondary index in LSM-based stores. In this approach, the secondary index can be initialized at an arbitrary moment. Thereafter, only the requested data are inserted into the secondary index. These data are retrieved from the independent immutable files created during index initialization, in parallel. The method works in a dynamic database environment, where database modifications interleave with user queries. The experiments show that the proposed approach outperforms traditional methods by about 30%.
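The workflow described in this paragraph can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the algorithm of the paper: the names `Run` and `AdaptiveIndex`, the in-memory sorted tuples standing in for immutable files, integer primary keys, the thread pool, and the `parallel_fetches` counter are all hypothetical. It also assumes that the parallel step is the retrieval from the files, and it leaves out database modifications entirely.

```python
import bisect
from concurrent.futures import ThreadPoolExecutor


class Run:
    """Immutable sorted run of (secondary_key, primary_key) entries."""

    def __init__(self, entries):
        self.entries = tuple(sorted(entries))
        self.keys = tuple(key for key, _ in self.entries)

    def range(self, low, high):
        """Return the entries with low <= secondary_key <= high."""
        start = bisect.bisect_left(self.keys, low)
        end = bisect.bisect_right(self.keys, high)
        return self.entries[start:end]


class AdaptiveIndex:
    def __init__(self, records, run_size=4, workers=4):
        # Index initialization: cut the data into independent sorted runs.
        self.runs = [Run(records[i:i + run_size])
                     for i in range(0, len(records), run_size)]
        self.workers = workers
        self.index = Run([])        # entries merged into the index so far
        self.covered = []           # key ranges fully present in the index
        self.parallel_fetches = 0   # how often the runs had to be searched

    def query(self, low, high):
        if not any(lo <= low and high <= hi for lo, hi in self.covered):
            # Range not indexed yet: search all runs concurrently.
            with ThreadPoolExecutor(max_workers=self.workers) as pool:
                parts = pool.map(lambda run: run.range(low, high), self.runs)
                fetched = {entry for part in parts for entry in part}
            # Merge only the requested entries into the index.
            self.index = Run(set(self.index.entries) | fetched)
            self.covered.append((low, high))
            self.parallel_fetches += 1
        return list(self.index.range(low, high))


records = [(5, 1), (3, 2), (9, 3), (5, 4), (1, 5), (7, 6)]
adaptive = AdaptiveIndex(records, run_size=2, workers=2)
r1 = adaptive.query(3, 5)   # fetched from the runs and merged
r2 = adaptive.query(4, 5)   # already covered, answered from the index
r3 = adaptive.query(5, 9)   # partly new range, fetched again
```

The runs are never modified in this sketch; the index only records which key ranges it already covers. In CPython the threads mainly show the structure of the parallel step, since the searches here are pure Python and do not run truly concurrently.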



Acknowledgment

The paper is supported by Wroclaw University of Science and Technology (subvention number: IDUB/8211204601).

Author information

Authors and Affiliations

Department of Computer Science, Faculty of Information and Communication Technology, Wrocław University of Science and Technology, Wrocław, Poland
Wojciech Macyna, Michal Kukowski & Michal Zwarzko


Corresponding author

Correspondence to Wojciech Macyna.

Editor information

Editors and Affiliations

University of Vienna, Vienna, Austria
Christine Strauss
University of Tsukuba, Ibaraki, Japan
Toshiyuki Amagasa
Johannes Kepler University Linz, Linz, Austria
Gabriele Kotsis
Vienna University of Technology, Vienna, Austria
A Min Tjoa
Johannes Kepler University Linz, Linz, Austria
Ismail Khalil


About this paper

Cite this paper

Macyna, W., Kukowski, M., Zwarzko, M. (2023). Multi-core Adaptive Merging of the Secondary Index for LSM-Based Stores. In: Strauss, C., Amagasa, T., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2023. Lecture Notes in Computer Science, vol 14147. Springer, Cham. https://doi.org/10.1007/978-3-031-39821-6_20


DOI: https://doi.org/10.1007/978-3-031-39821-6_20
Published: 16 August 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-39820-9
Online ISBN: 978-3-031-39821-6
eBook Packages: Computer Science, Computer Science (R0)
