Abstract
In this chapter, we addressed the security of a Hadoop cluster when some of its nodes are compromised. We investigated the impact of attacks on the completion time of a MapReduce job when a node in the cluster is compromised. We studied three attack methods: (1) blocking all incoming data from the master node except for the special messages that relay the status of the slave node, (2) delaying the delivery of packets that are sent to the master node, and (3) performing an attack such as a denial-of-service attack against the master node. To understand the impact of these attacks, we implemented them in our testbed on cluster settings that consist of three, six, and nine slave nodes and a single master node. Our data show that these attacks can degrade the performance of MapReduce by increasing the computing time of MapReduce jobs.
Acknowledgement
This work was also supported in part by US National Science Foundation (NSF) under grants: CNS 1117175 and 1350145. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding agencies.
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Glenn, W., Yu, W. (2016). Cyber Attacks on MapReduce Computation Time in a Hadoop Cluster. In: Yu, S., Guo, S. (eds) Big Data Concepts, Theories, and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-27763-9_7
DOI: https://doi.org/10.1007/978-3-319-27763-9_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27761-5
Online ISBN: 978-3-319-27763-9
eBook Packages: Computer Science, Computer Science (R0)