Design of Big Data Privacy Framework—A Balancing Act

  • Conference paper

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 612)

Abstract

Technological advancements in Big Data and IoT have led to unprecedented growth in digital data. Business organizations, government agencies, and the healthcare sector collect data from multiple distributed sources and mine it for valuable insights that support optimized decision making. The data thus amassed may also contain sensitive personal information about individuals, which is at risk of disclosure during analytics. Hence, there is a need for a privacy-aware system that enforces the protection of sensitive data. Such a system, however, constrains the usefulness of the data. The study shows that although significant findings exist for balancing these conflicting objectives, the efficacy and scalability of these solutions continue to challenge the research community, given the volume of Big Data. Finding the appropriate blend of these objectives for the mutual benefit of organizations and customers requires leveraging the modern tools and technologies of the Big Data ecosystem. This study extensively reviews previous work on privacy-preserving Big Data analytics, and the review is the first of its kind in exploring the challenges that must be overcome to strike a balance between data value, privacy, scalability, and performance.

Author information

Corresponding author

Correspondence to P. Geetha.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Geetha, P., Naikodi, C., Setty, S.L.N. (2020). Design of Big Data Privacy Framework—A Balancing Act. In: Jain, V., Chaudhary, G., Taplamacioglu, M., Agarwal, M. (eds) Advances in Data Sciences, Security and Applications. Lecture Notes in Electrical Engineering, vol 612. Springer, Singapore. https://doi.org/10.1007/978-981-15-0372-6_19
