Precise Data Access on Distributed Log-Structured Merge-Tree
The log-structured merge tree (LSM-tree) decomposes a large database into multiple parts: an in-writing part and several read-only ones. It achieves high write throughput as well as low read latency. However, a read request may have to go through multiple structures to find the required data. In a distributed database system, the parts of the LSM-tree are stored on different servers, so a server in the query layer must issue extra network requests to pull entries from the underlying storage layer. This work proposes a precise data access strategy. A Bloom filter-based structure is designed to test whether an element exists in the in-writing part of the LSM-tree. A lease-based synchronization strategy is used to maintain consistent copies of the Bloom filter on remote query servers. Experiments show that the solution achieves a 6× throughput improvement over existing methods.
Keywords: Data access, Distributed system, Consistency
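The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the filter layout or the lease protocol, so the class names (`BloomFilter`, `InWritingPart`, `QueryServerCache`), the parameters, and the rule that the storage side pushes each new key to every unexpired lease holder are all assumptions made for illustration. Network calls are replaced by direct method calls, and time is passed in explicitly as `now`.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))

    def snapshot(self):
        # Independent copy, standing in for shipping the filter over the network.
        copy = BloomFilter(self.num_bits, self.num_hashes)
        copy.bits = bytearray(self.bits)
        return copy


class QueryServerCache:
    """Hypothetical query-server copy of the filter, usable only under a valid lease."""

    def __init__(self):
        self.filter = None
        self.lease_expiry = 0.0

    def install(self, bloom, lease_expiry):
        self.filter = bloom
        self.lease_expiry = lease_expiry

    def must_read_in_writing_part(self, key, now):
        # Without a valid lease the copy may be stale, so always go remote.
        if self.filter is None or now >= self.lease_expiry:
            return True
        # Under a valid lease, a negative answer lets the remote read be skipped.
        return self.filter.might_contain(key)


class InWritingPart:
    """Hypothetical storage-side in-writing part that grants leases on its filter."""

    def __init__(self, lease_seconds=5.0):
        self.data = {}
        self.filter = BloomFilter()
        self.lease_seconds = lease_seconds
        self.holders = []  # (cache, expiry) pairs for granted leases

    def grant_lease(self, cache, now):
        expiry = now + self.lease_seconds
        cache.install(self.filter.snapshot(), expiry)
        self.holders.append((cache, expiry))

    def put(self, key, value, now):
        self.data[key] = value
        self.filter.add(key)
        # Assumed rule: forget expired leases, update every live copy.
        self.holders = [(c, e) for c, e in self.holders if e > now]
        for cache, _ in self.holders:
            cache.filter.add(key)


store = InWritingPart(lease_seconds=5.0)
cache = QueryServerCache()
store.grant_lease(cache, now=0.0)
store.put("row-42", "v1", now=1.0)
print(cache.must_read_in_writing_part("row-42", now=2.0))
```

In this sketch a query server contacts the in-writing part only when its leased copy reports a possible hit or when the lease has lapsed. A real deployment would also have to handle message loss and clock skew, which the sketch ignores.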
This work is partially supported by the National High-tech R&D Program (863 Program) under grant number 2015AA015307, the National Science Foundation of China under grant numbers 61332006, 61432006 and 61672232, and the Youth Science and Technology “Yang Fan” Program of Shanghai (17YF1427800). The corresponding author is Huiqi Hu.