A Privacy Preserving Platform for MapReduce

  • Conference paper
Applications and Techniques in Information Security (ATIS 2017)

Abstract

Big data applications typically require a large number of clusters running in parallel to process data quickly and efficiently. This processing is typically controlled and managed by MapReduce. In MapReduce operations, Mappers transform the original input key/value pairs into a set of intermediate key/value pairs, while Reducers aggregate the intermediate values, compute a result, and write it to the output. The output, however, can raise serious privacy concerns. First, the output can directly leak sensitive information because it contains the global view of the final computation. Second, the output can indirectly leak information via composite attacks, in which the adversary links it with public information published through other sources such as Facebook or Twitter. To address these concerns, we propose a privacy-preserving platform that prevents privacy leakage in MapReduce. Our platform can be plugged into the Reduce phase to sanitize the final output so that privacy is preserved while high data utility is retained. We demonstrate the feasibility of our platform through empirical studies and highlight that our proposal can be used in real-life applications.
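The Map/Reduce flow described in the abstract, with a sanitization hook placed in the Reduce phase, might be sketched as follows. This is a minimal single-process illustration, not the authors' platform: the abstract does not say which sanitization technique is used, so the small-count suppression rule shown here (`suppress_small_counts`, threshold `k`) is an assumption chosen only because it is simple and deterministic. All function names and the sample records are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical input records of the form (city, diagnosis); not data from the paper.
Record = Tuple[str, str]
Sanitizer = Callable[[str, int], Optional[Tuple[str, int]]]


def mapper(record: Record) -> Iterable[Tuple[str, int]]:
    """Map phase: turn one input record into intermediate (key, value) pairs."""
    _city, diagnosis = record
    yield diagnosis, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group intermediate values by key, as the framework does between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def suppress_small_counts(key: str, count: int, k: int = 3) -> Optional[Tuple[str, int]]:
    """Assumed sanitizer: drop any group smaller than k so rare values are not released."""
    return (key, count) if count >= k else None


def reducer(key: str, values: List[int], sanitize: Sanitizer) -> Optional[Tuple[str, int]]:
    """Reduce phase: aggregate the values, then sanitize the result before output."""
    return sanitize(key, sum(values))


def run_job(records: Iterable[Record],
            sanitize: Sanitizer = suppress_small_counts) -> Dict[str, int]:
    """Run map, shuffle, and reduce in one process and collect the released output."""
    pairs = [pair for record in records for pair in mapper(record)]
    output: Dict[str, int] = {}
    for key, values in shuffle(pairs).items():
        result = reducer(key, values, sanitize)
        if result is not None:  # None means the sanitizer suppressed this group
            output[result[0]] = result[1]
    return output


records = [("A", "flu")] * 4 + [("B", "flu")] + [("A", "rare_condition")]
print(run_job(records))
```

Because the sanitizer is passed in as a parameter, the same job can be run with a different rule (for example, one that adds calibrated noise to each count) without touching the Map phase, which mirrors the abstract's idea of a component that plugs into the Reduce phase.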


Author information

Corresponding author

Correspondence to Sibghat Ullah Bazai.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Bazai, S.U., Jang-Jaccard, J., Zhang, X. (2017). A Privacy Preserving Platform for MapReduce. In: Batten, L., Kim, D., Zhang, X., Li, G. (eds) Applications and Techniques in Information Security. ATIS 2017. Communications in Computer and Information Science, vol 719. Springer, Singapore. https://doi.org/10.1007/978-981-10-5421-1_8

  • DOI: https://doi.org/10.1007/978-981-10-5421-1_8

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-5420-4

  • Online ISBN: 978-981-10-5421-1

  • eBook Packages: Computer Science (R0)
