Precise Data Access on Distributed Log-Structured Merge-Tree

  • Tao Zhu
  • Huiqi Hu (Email author)
  • Weining Qian
  • Aoying Zhou
  • Mengzhan Liu
  • Qiong Zhao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10367)


A log-structured merge tree (LSM-tree) decomposes a large database into multiple parts: one in-writing part and several read-only ones. This design achieves high write throughput as well as low read latency. However, a read request has to go through multiple structures to find the required data. In a distributed database system, the parts of the LSM-tree are stored on different servers, so data access incurs extra network communication when a server in the query layer pulls entries from the underlying storage layer. This work proposes a precise data access strategy. A Bloom filter-based structure is designed to test whether an element exists in the in-writing part of the LSM-tree, and a lease-based synchronization strategy is used to maintain consistent copies of the Bloom filter on remote query servers. Experiments show that the solution achieves a 6× throughput improvement over existing methods.
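
To make the idea concrete, the following is a minimal Python sketch of how a query server might use a leased Bloom filter copy to skip unnecessary remote reads. It is not the authors' implementation: the class names `BloomFilter` and `QueryServer`, the parameters `num_bits`, `num_hashes` and `lease_expiry`, and the use of plain dictionaries to stand in for the remote in-writing part and the read-only parts are all assumptions made for illustration. The sketch only checks whether the lease is still valid; how the filter copy is refreshed is left out.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class QueryServer:
    """Hypothetical query-layer node holding a leased copy of the filter."""

    def __init__(self, filter_copy, lease_expiry, in_writing, read_only):
        self.filter_copy = filter_copy
        self.lease_expiry = lease_expiry  # time until which the copy is trusted
        self.in_writing = in_writing      # stands in for the remote in-writing part
        self.read_only = read_only        # stands in for the read-only parts
        self.remote_calls = 0             # counts simulated remote lookups

    def read(self, key, now):
        lease_valid = now < self.lease_expiry
        if lease_valid and not self.filter_copy.might_contain(key):
            # A negative answer is exact, so the in-writing part can be skipped.
            return self.read_only.get(key)
        # Lease expired or possible hit: pay for the remote lookup.
        self.remote_calls += 1
        if key in self.in_writing:
            return self.in_writing[key]
        return self.read_only.get(key)
```

A caller would add every key written to the in-writing part to the filter, then call `read(key, now)` on the query server. Reads for keys the filter rules out are answered from the read-only parts alone, while an expired lease forces the server to fall back to the remote lookup so that a stale copy never hides a recent write.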


Data access, Distributed system, Consistency



This work is partially supported by the National High-tech R&D Program (863 Program) under grant number 2015AA015307, the National Science Foundation of China under grant numbers 61332006, 61432006 and 61672232, and the Youth Science and Technology “Yang Fan” Program of Shanghai (17YF1427800). The corresponding author is Huiqi Hu.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Tao Zhu (1)
  • Huiqi Hu (1), Email author
  • Weining Qian (1)
  • Aoying Zhou (1)
  • Mengzhan Liu (2)
  • Qiong Zhao (2)
  1. School of Data Science and Engineering, East China Normal University, Shanghai, China
  2. Bank of Communications, Shanghai, China
