Comprehensive Analytics of Large Data Query Processing on Relational Database with SSDs

  • Keisuke Suzuki
  • Yuto Hayamizu
  • Daisaku Yokoyama
  • Miyuki Nakano
  • Masaru Kitsuregawa
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8506)

Abstract

Solid-state drives (SSDs) are widely used in large data processing applications because they offer higher random access throughput than HDDs and can process I/O requests in parallel. These advantages can resolve the I/O bottlenecks that database systems face on HDDs. However, access latency in the cache hierarchy may become a new bottleneck in SSD-based databases. In this study, we quantitatively analyzed the behavior of SSD-based databases, taking the hashjoin operation as an example. We found that cache misses in SSD-based databases can be decreased by reducing the hashtable size so that it fits into the cache. This is possible because the high throughput of SSDs keeps the I/O cost from increasing, even though the hashjoin partition files become fragmented. We also observed that a multi-hashjoin query does not increase cache misses. This is because the total size of the multiple hashtables can fit into the cache in SSD-based databases, in contrast to HDD-based databases, where hashtables require almost all of the available memory. Overall, our analysis shows that the performance of multiple queries in SSD-based databases can be improved by considering the data access locality of the hashjoin operation and choosing a hashtable size that fits into the cache.
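
The idea of sizing hashtables to fit into the cache can be illustrated with a minimal sketch of a partitioned (GRACE-style) hashjoin. This is not the authors' implementation or the code of any particular RDBMS: the function names, the `bytes_per_entry` estimate, and the `cache_bytes` budget are all hypothetical, and the partitions are kept in memory rather than written to partition files on an SSD.

```python
from collections import defaultdict


def choose_num_partitions(build_rows, bytes_per_entry, cache_bytes):
    """Smallest partition count for which an average-sized partition's
    hashtable stays within cache_bytes (hypothetical sizing rule)."""
    total_bytes = build_rows * bytes_per_entry
    return max(1, -(-total_bytes // cache_bytes))  # ceiling division


def partition(rows, key, num_partitions):
    """Split rows into num_partitions buckets by hashing the join key."""
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[hash(row[key]) % num_partitions].append(row)
    return parts


def partitioned_hash_join(build, probe, key, num_partitions):
    """Join two lists of dict rows on `key`, one partition pair at a time.

    Only one small hashtable exists at any moment, so more partitions
    mean a smaller working set per build/probe pass.
    """
    result = []
    build_parts = partition(build, key, num_partitions)
    probe_parts = partition(probe, key, num_partitions)
    for build_part, probe_part in zip(build_parts, probe_parts):
        table = defaultdict(list)
        for row in build_part:          # build phase
            table[row[key]].append(row)
        for row in probe_part:          # probe phase
            for match in table.get(row[key], ()):
                result.append({**match, **row})
    return result


if __name__ == "__main__":
    customers = [{"id": i, "name": f"c{i}"} for i in range(10)]
    orders = [{"id": i % 5, "amount": i} for i in range(20)]
    n = choose_num_partitions(len(customers), bytes_per_entry=64, cache_bytes=256)
    joined = partitioned_hash_join(customers, orders, "id", n)
    print(n, len(joined))
```

Raising the partition count shrinks each hashtable but spreads the input over more, smaller partitions. On an HDD the resulting fragmented partition files would be costly to read, whereas the abstract reports that the high throughput of SSDs absorbs this cost.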

Keywords

RDBMS, SSD, Hashjoin, OLAP

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Keisuke Suzuki (1)
  • Yuto Hayamizu (1)
  • Daisaku Yokoyama (1)
  • Miyuki Nakano (2)
  • Masaru Kitsuregawa (1, 3)
  1. University of Tokyo, Japan
  2. Shibaura Institute of Technology, Japan
  3. National Institute of Informatics, Japan
