Query Optimization on Hybrid Storage

  • Anxuan Yu
  • Qingzhong Meng
  • Xuan Zhou
  • Binyu Shen
  • Yansong Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10177)

Abstract

Thanks to the rapid growth of memory capacity, it is now feasible to perform query processing completely in memory. Nevertheless, as main memory is substantially more expensive than most secondary storage devices, including HDDs and SSDs, it is not suitable for storing cold data. Therefore, hybrid data storage composed of both memory and secondary storage is expected to remain popular for the foreseeable future. In this paper, we introduce a query optimization model for hybrid data storage. Unlike traditional query processors, which treat either main memory as a cache or secondary storage as an anti-cache, our model performs semantic data partitioning between memory and secondary storage. Query optimization can thus take the partitioning of data into account to achieve better performance. We conducted an experimental evaluation on a columnar query engine to demonstrate the advantage of the proposed approach.
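
To make the idea of partition-aware planning concrete, the following is a minimal sketch and not the model proposed in the paper. It assumes, purely for illustration, that a table is split by a range predicate on an integer key into a hot partition held in memory and a cold partition on secondary storage; the names Partition, overlaps, plan_scan, and the tier labels are hypothetical. A range query then reads only the partitions whose predicate overlaps the query range, so a query confined to hot data never touches secondary storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """A semantic partition: all rows whose key lies in [low, high)."""
    name: str
    tier: str  # "memory" or "disk" (hypothetical tier labels)
    low: int
    high: int


def overlaps(p, q_low, q_high):
    """True if the half-open query range [q_low, q_high) intersects p."""
    return q_low < p.high and p.low < q_high


def plan_scan(partitions, q_low, q_high):
    """Return the names of partitions a range query must read, memory first."""
    needed = [p for p in partitions if overlaps(p, q_low, q_high)]
    # Stable sort: in-memory partitions are scheduled before on-disk ones.
    needed.sort(key=lambda p: 0 if p.tier == "memory" else 1)
    return [p.name for p in needed]


# Assumed example layout: recent years are hot, older years are cold.
partitions = [
    Partition("cold_2000_2014", "disk", 2000, 2015),
    Partition("hot_2015_2017", "memory", 2015, 2018),
]

print(plan_scan(partitions, 2016, 2018))  # ['hot_2015_2017']
print(plan_scan(partitions, 2010, 2016))  # ['hot_2015_2017', 'cold_2000_2014']
```

The first query falls entirely inside the hot partition, so the plan skips the disk-resident partition; the second spans both tiers and must read both. A real optimizer would also weigh access costs per tier, which this sketch leaves out.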

Notes

Acknowledgement

This work is partially supported by the Chinese National High-tech R&D Program (863 Program) (2015AA015307) and the NSFC Project (No. 61272138).

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Anxuan Yu (1)
  • Qingzhong Meng (1)
  • Xuan Zhou (1)
  • Binyu Shen (1)
  • Yansong Zhang (1)

  1. DEKE Lab, Renmin University of China, Beijing, China
