
Hadoop MapReduce Performance on SSDs: The Case of Complex Network Analysis Tasks

  • Marios Bakratsas
  • Pavlos Basaras
  • Dimitrios Katsaros
  • Leandros Tassiulas
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 529)

Abstract

This article investigates the relative performance of solid state drives (SSDs) versus hard disk drives (HDDs) when they are used as the underlying storage for Hadoop’s MapReduce. We examine MapReduce tasks and data suited to the analysis of complex networks, where the tasks exhibit different execution patterns. The results partly confirm earlier studies showing that SSDs are beneficial to Hadoop; they also provide solid evidence that the processing pattern of the running application plays a significant role.

Keywords

Hard Disk Drive, Magnetic Disk, Solid State Drive, Disk Type, Mutual Friend

Notes

Acknowledgement

This work was supported by the Project “REDUCTION: Reducing Environmental Footprint based on Multi-Modal Fleet management System for Eco-Routing and Driver Behaviour Adaptation,” funded by the EU.ICT program, Challenge ICT-2011.7.

References

  1. Chen, Y., Ganapathi, A., Griffith, R., Katz, R.: The case for evaluating MapReduce performance using workload suites. In: Proceedings of IEEE MASCOTS, pp. 390–399 (2011)
  2. Huang, S., Huang, J., Dai, J., Xie, T., Huang, B.: The HiBench benchmark suite: Characterization of the MapReduce-based data analysis. In: Proceedings of ICDE Workshops (2010)
  3. Islam, N., Rahman, M., Jose, J., Rajachandrasekar, R., Wang, H., Subramoni, H., Murthy, C., Panda, D.: High performance RDMA-design of HDFS over InfiniBand. In: Proceedings of SC (2012)
  4. Kambatla, K., Chen, Y.: The truth about MapReduce performance on SSDs. In: Proceedings of LISA, pp. 109–117 (2014)
  5. Kang, S.-H., Koo, D.-H., Kang, W.-H., Lee, S.-W.: A case for flash memory SSD in Hadoop applications. Int. J. Control Autom. 6, 201–210 (2013)
  6. Krish, K.R., Iqbal, M.S., Butt, A.R.: VENU: orchestrating SSDs in Hadoop storage. In: Proceedings of IEEE BigData, pp. 207–212 (2014)
  7. Min, C., Kim, K., Cho, H., Lee, S.-W., Eom, Y.I.: SFS: random write considered harmful in solid state drives. In: Proceedings of USENIX FAST (2012)
  8. Moon, S., Lee, J., Kee, Y.S.: Introducing SSDs to the Hadoop MapReduce framework. In: Proceedings of IEEE CLOUD, pp. 272–279 (2014)
  9. Saxena, P., Chou, J.: How much solid state drive can improve the performance of Hadoop cluster? Performance evaluation of Hadoop on SSD and HDD. Int. J. Mod. Commun. Technol. Res. 2(5), 1–7 (2014)
  10. Sur, S., Wang, H., Huang, J., Ouyang, X., Panda, D.: Can high-performance interconnects benefit Hadoop distributed file system. In: Proceedings of the Workshop MASVDC (2010)
  11. Wu, D., Xie, W., Ji, X., Luo, W., He, J., Wu, D.: Understanding the impacts of solid-state storage on the Hadoop performance. In: Proceedings of Advanced Cloud and Big Data, pp. 125–130 (2013)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Marios Bakratsas (1)
  • Pavlos Basaras (1)
  • Dimitrios Katsaros (1, 2), email author
  • Leandros Tassiulas (2)

  1. Department of Electrical and Computer Engineering, University of Thessaly, Volos, Greece
  2. Department of Electrical Engineering and Yale Institute for Network Science, Yale University, New Haven, USA
