Cyber Attacks on MapReduce Computation Time in a Hadoop Cluster

Big Data Concepts, Theories, and Applications

Abstract

In this chapter, we address the security of a Hadoop cluster in which some nodes are compromised. We investigate the impact of attacks on the completion time of a MapReduce job when a node in a Hadoop cluster is compromised. We study three attack methods: (1) blocking all incoming data from the master node except for the special messages that relay the status of the slave node, (2) delaying the delivery of packets that are sent to the master node, and (3) performing an attack such as a denial-of-service attack against the master node. To understand the impact of these attacks, we implemented them in our testbed on cluster configurations consisting of three, six, and nine slave nodes and a single master node. Our data show that these attacks can degrade MapReduce performance by increasing the computing time of MapReduce jobs.
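
As a rough illustration of how the increase in computing time described above might be quantified, the following sketch compares mean job completion times with and without an attack. It is not the authors' method: the function name `slowdown_percent` and all timing values are hypothetical placeholders, not measurements from the chapter.

```python
from statistics import mean


def slowdown_percent(baseline_times, attacked_times):
    """Return the percent increase in mean job completion time under attack.

    Both arguments are sequences of completion times in seconds for repeated
    runs of the same MapReduce job on the same cluster configuration.
    """
    base = mean(baseline_times)
    if base <= 0:
        raise ValueError("baseline mean must be positive")
    return (mean(attacked_times) - base) / base * 100.0


# Hypothetical timings in seconds (assumed for illustration only).
baseline = [100.0, 102.0, 98.0]
attacked = [150.0, 147.0, 153.0]

print(f"{slowdown_percent(baseline, attacked):.1f}% slower")
```

Repeating such a comparison for each cluster size (three, six, and nine slave nodes) would show whether the relative slowdown changes as the cluster grows.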


Acknowledgement

This work was also supported in part by US National Science Foundation (NSF) under grants: CNS 1117175 and 1350145. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding agencies.

Author information

Corresponding author

Correspondence to Wei Yu.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Glenn, W., Yu, W. (2016). Cyber Attacks on MapReduce Computation Time in a Hadoop Cluster. In: Yu, S., Guo, S. (eds) Big Data Concepts, Theories, and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-27763-9_7

  • DOI: https://doi.org/10.1007/978-3-319-27763-9_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27761-5

  • Online ISBN: 978-3-319-27763-9

  • eBook Packages: Computer Science, Computer Science (R0)
