Towards Privacy for MapReduce on Hybrid Clouds Using Information Dispersal Algorithm

  • Asma Ben Cheikh
  • Heithem Abbes
  • Gilles Fedak
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8648)

Abstract

MapReduce is a powerful model for parallel data processing. The motivation of this work is to allow MapReduce jobs to run partially on untrusted infrastructures, such as public clouds and desktop grids, while using a trusted infrastructure, such as a private cloud, to ensure that no outsider can obtain the entire information. Our idea is to break the data into meaningless chunks and spread them over a combination of public and private clouds, so that a compromise would not allow an attacker to reconstruct the whole data set. To realize this, we use the Information Dispersal Algorithm (IDA), which splits a file into pieces so that, when the pieces are carefully dispersed, no single node can reconstruct the data unless it collaborates with other nodes. We propose a protocol that allows MapReduce computing nodes to exchange data and perform IDA-aware MapReduce computation. We conduct experiments on the Grid’5000 testbed and report a performance evaluation of the prototype.
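
To illustrate the dispersal idea, the following is a minimal sketch of a Rabin-style information dispersal scheme, not the prototype evaluated in this work. It assumes arithmetic modulo the prime 257, a Vandermonde encoding matrix, and hypothetical function names (`disperse`, `reconstruct`) and parameters (`n` pieces, any `m` of which suffice); none of these choices come from the paper. With fewer than `m` pieces the linear system is underdetermined, although this alone is not a cryptographic secrecy guarantee.

```python
P = 257  # prime modulus (assumption for illustration; pieces hold values 0..256)


def _row(x, m):
    """Vandermonde row [1, x, x^2, ..., x^(m-1)] mod P."""
    return [pow(x, j, P) for j in range(m)]


def disperse(data: bytes, n: int, m: int) -> dict:
    """Split data into n pieces (n < P); any m of them allow reconstruction."""
    padded = data + bytes(-len(data) % m)          # zero-pad to a multiple of m
    blocks = [padded[i:i + m] for i in range(0, len(padded), m)]
    pieces = {}
    for x in range(1, n + 1):                      # piece index doubles as evaluation point
        row = _row(x, m)
        pieces[x] = [sum(r * b for r, b in zip(row, blk)) % P for blk in blocks]
    return pieces


def _invert(matrix):
    """Invert a square matrix over GF(P) by Gauss-Jordan elimination."""
    m = len(matrix)
    aug = [row[:] + [int(i == j) for j in range(m)] for i, row in enumerate(matrix)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if aug[r][col] % P)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)            # modular inverse (Python 3.8+)
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(m):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(a - f * b) % P for a, b in zip(aug[r], aug[col])]
    return [row[m:] for row in aug]


def reconstruct(pieces: dict, m: int, length: int) -> bytes:
    """Rebuild the original bytes from any m pieces."""
    xs = sorted(pieces)[:m]
    if len(xs) < m:
        raise ValueError("need at least m pieces")
    inv = _invert([_row(x, m) for x in xs])
    out = bytearray()
    for k in range(len(pieces[xs[0]])):
        col = [pieces[x][k] for x in xs]
        for row in inv:                            # solve for the k-th block
            out.append(sum(r * c for r, c in zip(row, col)) % P)
    return bytes(out[:length])


data = b"hello mapreduce"
pieces = disperse(data, n=5, m=3)                  # e.g. spread 5 pieces over several clouds
subset = {x: pieces[x] for x in (2, 4, 5)}         # any 3 pieces are enough
recovered = reconstruct(subset, m=3, length=len(data))
```

In a hybrid deployment one could, for example, keep enough pieces on the private cloud that the public nodes together hold fewer than `m`; how the pieces are actually placed in the proposed protocol is described in the full paper, not here.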

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Asma Ben Cheikh (1)
  • Heithem Abbes (1)
  • Gilles Fedak (2)
  1. Université de Tunis, LaTICE, ENSIT, Tunis, Tunisia
  2. Université de Lyon, INRIA, France
