Data Privacy

Part of the Studies in Big Data book series (SBD, volume 46)

Abstract

In this chapter we present an overview of data privacy. We first review privacy models and measures of disclosure risk, which provide computational definitions of what privacy means and of how to evaluate the privacy level of a data set. We then summarize data protection mechanisms and classify them along three dimensions: whose privacy is being sought, the computations to be done, and the number of data sources. Finally, we describe masking methods, the data protection mechanisms used for databases when the intended data use is undefined and the protected database must remain useful for several data uses. We also define information loss (or data utility) for this type of data protection mechanism. The chapter finishes with a summary.
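
To make the trade-off between masking and information loss concrete, here is a minimal sketch that is not taken from the chapter. It assumes additive Gaussian noise as the masking method and the mean absolute change per record as the information loss measure. The function names, the `relative_sd` parameter, and the sample data are all hypothetical.

```python
import random
import statistics


def mask_with_noise(values, relative_sd=0.1, seed=0):
    """Additive-noise masking: add zero-mean Gaussian noise to each value.

    relative_sd is a hypothetical tuning parameter: the noise standard
    deviation expressed as a fraction of the data's own standard deviation.
    """
    rng = random.Random(seed)
    noise_sd = relative_sd * statistics.pstdev(values)
    return [v + rng.gauss(0.0, noise_sd) for v in values]


def information_loss(original, masked):
    """One simple (assumed) loss measure: mean absolute change per record."""
    return sum(abs(o - m) for o, m in zip(original, masked)) / len(original)


# Illustrative data only: a single numerical attribute.
ages = [23, 35, 41, 29, 52, 47, 38, 31]
light = mask_with_noise(ages, relative_sd=0.05)
heavy = mask_with_noise(ages, relative_sd=0.50)
print(information_loss(ages, light), information_loss(ages, heavy))
```

Under these assumptions, stronger noise yields a larger information loss. A real evaluation would also measure disclosure risk, which this sketch leaves out.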

Keywords

  • Data Protection Mechanisms
  • Disclosure Risk
  • Masking Method
  • Privacy Model
  • Differential Privacy

Author information

Corresponding author

Correspondence to Vicenç Torra.

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Torra, V., Navarro-Arribas, G., Stokes, K. (2019). Data Privacy. In: Said, A., Torra, V. (eds) Data Science in Practice. Studies in Big Data, vol 46. Springer, Cham. https://doi.org/10.1007/978-3-319-97556-6_7
