Multi-core Adaptive Merging of the Secondary Index for LSM-Based Stores

  • Conference paper
Database and Expert Systems Applications (DEXA 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14147))

Abstract

NoSQL databases have gained great popularity in recent years. Most of them use the Log-Structured Merge (LSM) tree, which provides high write throughput and fast lookup by primary key. Searching by non-key attributes, however, is very slow because the entire LSM-tree must be scanned. A secondary index can overcome this problem. Typically, the secondary index covers all items in the database equally. This is inefficient in big data stores, where some items are queried very often and others never. Adaptive merging was introduced to solve this problem. The key idea is to build the secondary index adaptively, as a by-product of query processing. Consequently, the database is indexed only partially, depending on the query workload.

This paper considers adaptive merging of the secondary index in LSM-based stores. In this approach, the secondary index can be initialized at an arbitrary moment. Thereafter, only the requested data are inserted into the secondary index. They are retrieved in parallel from the independent immutable files created during index initialization. The method works in a dynamic database environment where database modifications interleave with user queries. The experiments show that the proposed approach outperforms traditional methods by about 30%.
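
The general idea of adaptive merging can be illustrated with a minimal Python sketch. This is not the authors' algorithm or code: the class name AdaptiveSecondaryIndex, the run_size parameter, the in-memory lists standing in for immutable files, and the use of a thread pool are all assumptions made for illustration, and database modifications are left out entirely. Initialization splits the data into sorted runs; each range query then pulls only the requested entries out of the runs in parallel and merges them into the index, recording which key ranges are already covered.

```python
from bisect import bisect_left, bisect_right
from concurrent.futures import ThreadPoolExecutor


class AdaptiveSecondaryIndex:
    """Toy adaptive merging; in-memory lists stand in for immutable files."""

    def __init__(self, records, run_size=4):
        # records: list of (primary_key, secondary_value) pairs.
        # Initialization: cut the data into runs, each sorted by secondary value.
        items = [(sec, pk) for pk, sec in records]
        self.runs = []
        for i in range(0, len(items), run_size):
            entries = sorted(items[i:i + run_size])
            keys = [sec for sec, _ in entries]
            self.runs.append((keys, entries))  # never modified afterwards
        self.index = []    # sorted (secondary_value, primary_key) pairs merged so far
        self.covered = []  # sorted, disjoint (lo, hi) ranges already merged

    @staticmethod
    def _slice_run(run, lo, hi):
        # Binary search one run for entries with lo <= secondary_value <= hi.
        keys, entries = run
        return entries[bisect_left(keys, lo):bisect_right(keys, hi)]

    def _is_covered(self, lo, hi):
        return any(c_lo <= lo and hi <= c_hi for c_lo, c_hi in self.covered)

    def _mark_covered(self, lo, hi):
        # Insert the new range and fuse it with any overlapping ranges.
        merged = []
        for c_lo, c_hi in sorted(self.covered + [(lo, hi)]):
            if merged and c_lo <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], c_hi))
            else:
                merged.append((c_lo, c_hi))
        self.covered = merged

    def query(self, lo, hi):
        """Return primary keys whose secondary value lies in [lo, hi]."""
        if not self._is_covered(lo, hi):
            # Search all runs concurrently, then merge the hits into the index.
            with ThreadPoolExecutor() as pool:
                parts = pool.map(lambda run: self._slice_run(run, lo, hi), self.runs)
                fetched = [entry for part in parts for entry in part]
            self.index = sorted(set(self.index).union(fetched))
            self._mark_covered(lo, hi)
        keys = [sec for sec, _ in self.index]
        hits = self.index[bisect_left(keys, lo):bisect_right(keys, hi)]
        return [pk for _, pk in hits]


store = AdaptiveSecondaryIndex(
    [(1, 30), (2, 10), (3, 20), (4, 30), (5, 40), (6, 10)], run_size=2
)
print(store.query(20, 30))  # primary keys with secondary value in [20, 30]
print(len(store.index))     # only the requested entries have been merged
```

Repeating the same query is answered from the index alone, because the range is already recorded as covered. Entries that no query ever requests stay in the runs and never reach the index, which is the partial indexing the abstract describes.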

Acknowledgment

The paper is supported by Wroclaw University of Science and Technology (subvention number: IDUB/8211204601).

Author information

Corresponding author

Correspondence to Wojciech Macyna.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Macyna, W., Kukowski, M., Zwarzko, M. (2023). Multi-core Adaptive Merging of the Secondary Index for LSM-Based Stores. In: Strauss, C., Amagasa, T., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2023. Lecture Notes in Computer Science, vol 14147. Springer, Cham. https://doi.org/10.1007/978-3-031-39821-6_20

  • DOI: https://doi.org/10.1007/978-3-031-39821-6_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-39820-9

  • Online ISBN: 978-3-031-39821-6

  • eBook Packages: Computer Science, Computer Science (R0)
