
Eavesdropper: A Framework for Detecting the Location of the Processed Result in Hadoop

  • Chuntao Dong
  • Qingni Shen
  • Wenting Li
  • Yahui Yang
  • Zhonghai Wu
  • Xiang Wan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9543)

Abstract

Hadoop has become increasingly popular because it rapidly processes big data in parallel, and a number of security mechanisms have been introduced or studied for it. Even so, other security issues remain that should not be neglected, and data leakage is one of the major challenges. This paper studies a vulnerability in the authorization mechanisms of Hadoop services and the resulting threat of information leakage. Some authorization mechanisms allow all users to access services by default, so an adversary can use these services to collect information about other users. We design and implement Eavesdropper, a framework that uses k-means clustering to identify the nodes that store the processed results. We conduct a comprehensive set of experiments, which demonstrate that the framework is capable of detecting the nodes that store the results.
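To make the clustering idea concrete, the following is a minimal sketch of plain k-means (Lloyd's algorithm) separating cluster nodes into two groups. The abstract does not say which features Eavesdropper collects, so the node names, the two features (disk-write and network-receive volume during a job), and all values are hypothetical assumptions chosen only for illustration; this is not the authors' implementation.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iterations=100):
    """Plain Lloyd's k-means. Returns (centroids, labels)."""
    # Deterministic initialisation: the first k points act as centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_labels == labels and new_centroids == centroids:
            break  # converged: nothing changed in this pass
        labels, centroids = new_labels, new_centroids
    return centroids, labels


# Hypothetical per-node observations: (disk MB written, network MB received).
observations = {
    "node1": (5.0, 8.0),
    "node2": (6.0, 7.0),
    "node3": (120.0, 95.0),
    "node4": (4.0, 9.0),
    "node5": (110.0, 100.0),
    "node6": (7.0, 6.0),
}

names = list(observations)
centroids, labels = kmeans([observations[n] for n in names], k=2)

# Assumption: the cluster with the larger disk-write centroid holds the results.
busy = max(range(len(centroids)), key=lambda c: centroids[c][0])
suspects = [n for n, lab in zip(names, labels) if lab == busy]
print(suspects)
```

With these made-up values the two high-activity nodes end up in their own cluster. In practice the choice of features, the value of k, and the initialisation would all need to be validated against real measurements.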

Keywords

Hadoop · MapReduce · YARN · Security · Data leakage · k-means

Notes

Acknowledgment

This work is supported by the National High Technology Research and Development Program (“863” Program) of China under Grant No. 2015AA016009, the National Natural Science Foundation of China under Grant No. 61232005, and the Science and Technology Program of Shen Zhen, China under Grant No. JSGG20140516162852628.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Chuntao Dong (1, 2)
  • Qingni Shen (1, 2), corresponding author
  • Wenting Li (1, 2)
  • Yahui Yang (1, 2)
  • Zhonghai Wu (1, 2)
  • Xiang Wan (1)

  1. School of Software and Microelectronics, Peking University, Beijing, China
  2. MoE Key Lab of Network and Software Assurance, Peking University, Beijing, China
