
Modern Privacy Risks and Protection Strategies in Data Analytics

  • Conference paper
Soft Computing and Signal Processing

Abstract

Technology has transformed our socioeconomic life. We increasingly depend on technology for day-to-day activities such as banking, e-commerce, and retail. Extensive use of smartphone apps and enormous public interest in social media have led to a data-rich digital environment in which organizations generate and share large-scale data to improve decision making and foster business through data analytics. However, data analytics involves privacy threats that can lead to the disclosure of personal and sensitive data without the user's consent. Conventional data analytics operated on the data as a whole using aggregate queries. Modern applications such as recommendation systems and digital marketing involve analytics on person-specific individual records, which is more harmful to individual privacy. In this paper, we examine various privacy-related risks and privacy preservation strategies along with their potential and limitations, and we highlight important aspects of privacy legislation enacted in various countries, including the Personal Data Protection Bill of India and the European Union's General Data Protection Regulation (GDPR).




Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.


Cite this paper

Vasupula, N., Munnangi, V., Daggubati, S. (2022). Modern Privacy Risks and Protection Strategies in Data Analytics. In: Reddy, V.S., Prasad, V.K., Wang, J., Reddy, K.T.V. (eds) Soft Computing and Signal Processing. Advances in Intelligent Systems and Computing, vol 1340. Springer, Singapore. https://doi.org/10.1007/978-981-16-1249-7_9
