
Towards Privacy for MapReduce on Hybrid Clouds Using Information Dispersal Algorithm

  • Conference paper
Data Management in Cloud, Grid and P2P Systems (Globe 2014)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 8648))

Abstract

MapReduce is a powerful model for parallel data processing. The motivation of this work is to allow MapReduce jobs to run partially on untrusted infrastructures, such as public clouds and desktop grids, while using a trusted infrastructure, such as a private cloud, to ensure that no outsider can obtain the entire information. Our idea is to break the data into meaningless chunks and spread them over a combination of public and private clouds, so that a compromise of the untrusted infrastructure would not allow the attacker to reconstruct the whole data set. To realize this, we use the Information Dispersal Algorithm (IDA), which splits a file into pieces so that, when the pieces are carefully dispersed, no single node can reconstruct the data unless it collaborates with other nodes. We propose a protocol that allows MapReduce computing nodes to exchange data and perform IDA-aware MapReduce computation. We conduct experiments on the Grid’5000 testbed and report on a performance evaluation of the prototype.
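To make the dispersal idea concrete, the following is a minimal sketch of a Rabin-style (n, m) information dispersal scheme: the file is cut into blocks of m bytes, each of the n pieces stores one linear combination per block, and any m pieces suffice to rebuild the file. It is an illustration only, not the protocol or prototype described in the paper. The names `disperse` and `reconstruct`, the choice of the prime field modulo 257, and the Vandermonde coefficients are all assumptions made for this example. Note also that a piece on its own does not determine the data, but the sketch makes no claim of cryptographic secrecy.

```python
P = 257  # assumption: smallest prime above 255, so every byte is a field element


def _row(x, m):
    # Vandermonde row (1, x, x^2, ..., x^(m-1)) mod P; distinct x give
    # linearly independent rows, so any m of them form an invertible matrix.
    return [pow(x, j, P) for j in range(m)]


def disperse(data: bytes, n: int, m: int):
    """Split data into n pieces such that any m of them rebuild it."""
    assert 1 <= m <= n < P
    padded = data + bytes(-len(data) % m)  # zero-pad to a multiple of m
    blocks = [padded[i:i + m] for i in range(0, len(padded), m)]
    pieces = {}
    for x in range(1, n + 1):
        row = _row(x, m)
        # One field element per block; values may reach 256, so keep ints.
        pieces[x] = [sum(a * b for a, b in zip(row, blk)) % P for blk in blocks]
    return pieces, len(data)


def _invert(matrix):
    """Gauss-Jordan inversion of a square matrix modulo P."""
    m = len(matrix)
    aug = [list(r) + [int(i == j) for j in range(m)] for i, r in enumerate(matrix)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)  # modular inverse (Python 3.8+)
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(m):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(v - f * w) % P for v, w in zip(aug[r], aug[col])]
    return [r[m:] for r in aug]


def reconstruct(subset: dict, m: int, length: int) -> bytes:
    """Rebuild the original bytes from at least m pieces keyed by their index x."""
    if len(subset) < m:
        raise ValueError("fewer than m pieces: the data cannot be rebuilt")
    xs = sorted(subset)[:m]
    inv = _invert([_row(x, m) for x in xs])
    out = bytearray()
    for k in range(len(subset[xs[0]])):
        col = [subset[x][k] for x in xs]
        out.extend(sum(a * c for a, c in zip(r, col)) % P for r in inv)
    return bytes(out[:length])  # drop the zero padding


data = b"MapReduce on hybrid clouds"
pieces, length = disperse(data, n=5, m=3)
recovered = reconstruct({k: pieces[k] for k in (2, 4, 5)}, m=3, length=length)
```

In a hybrid deployment along the lines of the abstract, one could keep enough pieces on the private cloud that the pieces held by public nodes never reach the threshold m on their own; how the pieces are actually placed is a policy decision that this sketch does not cover.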

This work is supported by the French Agence Nationale de la Recherche through the MapReduce grant under contract ANR-10-SEGI-001-01, as well as INRIA ARC BitDew.



Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Ben Cheikh, A., Abbes, H., Fedak, G. (2014). Towards Privacy for MapReduce on Hybrid Clouds Using Information Dispersal Algorithm. In: Hameurlain, A., Dang, T.K., Morvan, F. (eds) Data Management in Cloud, Grid and P2P Systems. Globe 2014. Lecture Notes in Computer Science, vol 8648. Springer, Cham. https://doi.org/10.1007/978-3-319-10067-8_4

  • DOI: https://doi.org/10.1007/978-3-319-10067-8_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10066-1

  • Online ISBN: 978-3-319-10067-8

  • eBook Packages: Computer Science, Computer Science (R0)
