An Adaptive Partition-Based Caching Approach for Efficient Range Queries on Key-Value Data

  • Wei Ge
  • Min Chen
  • Chunfeng Yuan
  • Yihua Huang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9932)

Abstract

Range queries are a common requirement in big data scenarios, such as analytic and time-travel queries over web archives. We present AdaSI, an adaptive partition-based caching approach for efficient range queries on key-value data. AdaSI partitions the data into data slices, each a run of consecutive data items. The AdaSI Hotscore Algorithm then aims to maximize the cache-hit probability under a limited cache space. Guided by the Dutyrate and Hotscore measured for each data slice, AdaSI partitions hot data more finely to improve partitioning precision and adjustment sensitivity, whereas cold data are partitioned at a coarser granularity to reduce storage overhead and the search cost of queries. Our results show that the AdaSI Hotscore Algorithm achieves a cache hit rate nearly as high as that of record-based cache policies, while its speedup and space reduction far outperform those policies.
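The slice idea can be illustrated with a small sketch. It is not the AdaSI Hotscore Algorithm: the abstract does not define Dutyrate or Hotscore, so the code below substitutes a plain access counter for the hot score, splits a slice in half once that counter reaches a threshold, and fills the cache greedily by score per key. The class name `SliceTable`, the parameter `split_threshold`, and the method `cache_plan` are all hypothetical names introduced for illustration only.

```python
from bisect import bisect_right


class SliceTable:
    """Toy slice table over integer keys in [lo, hi).

    Assumption: the "score" is a simple access count, standing in for
    the paper's Hotscore, whose definition is not given in the abstract.
    """

    def __init__(self, lo, hi, split_threshold=4):
        self.starts = [lo]      # sorted start key of each slice
        self.hi = hi            # exclusive upper bound of the key space
        self.score = {lo: 0}    # access count per slice, keyed by start key
        self.split_threshold = split_threshold

    def _end(self, i):
        # A slice ends where the next one starts, or at the key-space bound.
        return self.starts[i + 1] if i + 1 < len(self.starts) else self.hi

    def query(self, a, b):
        """Record a range query [a, b) and split slices that became hot."""
        i = max(bisect_right(self.starts, a) - 1, 0)
        touched = []
        while i < len(self.starts) and self.starts[i] < b:
            touched.append(self.starts[i])
            i += 1
        for s in touched:
            self.score[s] += 1
            if self.score[s] >= self.split_threshold:
                self._split(s)

    def _split(self, s):
        i = self.starts.index(s)
        e = self._end(i)
        if e - s < 2:
            return              # a single-key slice cannot be split further
        mid = (s + e) // 2
        self.starts.insert(i + 1, mid)
        # Which half was hot is unknown here, so both inherit half the score.
        half = self.score[s] // 2
        self.score[s] = half
        self.score[mid] = half

    def slices(self):
        return [(s, self._end(i), self.score[s])
                for i, s in enumerate(self.starts)]

    def cache_plan(self, capacity):
        """Greedily pick slices by score per key until `capacity` keys are used."""
        ranked = sorted(self.slices(),
                        key=lambda t: t[2] / (t[1] - t[0]), reverse=True)
        plan, used = [], 0
        for s, e, _ in ranked:
            if used + (e - s) <= capacity:
                plan.append((s, e))
                used += e - s
        return sorted(plan)
```

Repeatedly calling `query(10, 20)` on `SliceTable(0, 100)` narrows the slices around the queried range while the rest of the key space stays in one coarse slice, which is the qualitative behaviour the abstract describes for hot and cold data.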

Keywords

Query optimization · Caching policy · Adaptive partitioning

Notes

Acknowledgments

This work is funded by China NSF Grants (61223003, 61572250, 61362006), Jiangsu Province Science & Technology Research Grant (BE2014131), Guangxi NSF (2014GXNSFBA118288), and the Guangxi IBAYT Program (KY2016YB065).

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. State Key Laboratory for Novel Software Technology, Nanjing University; Collaborative Innovation Center for Novel Software Technology and Industry of Jiangsu Province, Nanjing University, Nanjing, China
