Health and Technology

Volume 9, Issue 1, pp 65–75

Improving privacy preservation policy in the modern information age

  • John S. Davis II
  • Osonde Osoba
Original Paper


Abstract

Anonymization or de-identification techniques are methods for protecting the privacy of human subjects in sensitive data sets while preserving the utility of those data sets. In the case of health data, anonymization techniques may be used to remove or mask patient identities while allowing the health data content to be used by the medical and pharmaceutical research community. The efficacy of anonymization methods has come under repeated attack, and several researchers have shown that anonymized data can be re-identified to reveal the identity of the data subjects via approaches such as "linking." Nevertheless, despite these deficiencies, many government privacy policies depend on anonymization techniques as the primary approach to preserving privacy. In this report, we survey the anonymization landscape and consider the range of anonymization approaches that can be used to de-identify data containing personally identifiable information. We then review several notable government privacy policies that leverage anonymization. In particular, we review the European Union's General Data Protection Regulation (GDPR) and show that it takes a more goal-oriented approach to data privacy: it defines data privacy in terms of the desired outcome (a defense against the risk of personal data disclosure) and is agnostic to the actual method of privacy preservation. The GDPR goes further, framing its privacy-preservation requirements relative to the state of the art, the cost of implementation, the incurred risks, and the context of data processing. This has potential implications for the GDPR's robustness to future technological innovation, in contrast to privacy regulations that depend explicitly on more definite technical specifications.
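The re-identification risk described above, linking a "de-identified" table to a public, identified table via shared quasi-identifiers, can be sketched in a few lines. The records, names, and the generalization rules below are invented for illustration; the coarsening step is a minimal k-anonymity-style generalization, not the prescribed method of any particular policy.

```python
# A "de-identified" health table: direct identifiers removed, but
# quasi-identifiers (ZIP code, birth year, sex) retained. All data invented.
health = [
    {"zip": "90401", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"zip": "90402", "birth_year": 1980, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "90401", "birth_year": 1975, "sex": "M", "diagnosis": "flu"},
]

# A public, identified table (e.g. a voter roll) carrying the same quasi-identifiers.
public = [
    {"name": "Alice", "zip": "90401", "birth_year": 1975, "sex": "F"},
    {"name": "Bob",   "zip": "90402", "birth_year": 1980, "sex": "M"},
]

QI = ("zip", "birth_year", "sex")

def link(anon_rows, public_rows, qi=QI):
    """Linking attack: re-identify rows whose quasi-identifier
    combination matches exactly one record in the public table."""
    matches = {}
    for a in anon_rows:
        key = tuple(a[k] for k in qi)
        candidates = [p for p in public_rows
                      if tuple(p[k] for k in qi) == key]
        if len(candidates) == 1:
            matches[candidates[0]["name"]] = a["diagnosis"]
    return matches

def generalize(rows):
    """Coarsen quasi-identifiers before release: truncate ZIP to a
    3-digit prefix, bucket birth year by decade, suppress sex."""
    out = []
    for r in rows:
        g = dict(r)
        g["zip"] = r["zip"][:3] + "**"
        g["birth_year"] = (r["birth_year"] // 10) * 10
        g["sex"] = "*"
        out.append(g)
    return out

# Linking succeeds against the naively de-identified table...
print(link(health, public))              # {'Alice': 'asthma', 'Bob': 'diabetes'}
# ...but the exact-match link fails once quasi-identifiers are generalized.
print(link(generalize(health), public))  # {}
```

The sketch also illustrates the utility trade-off the abstract alludes to: the generalized table defeats this particular linkage, but researchers lose the fine-grained ZIP, age, and sex attributes in the process.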


Keywords: Privacy · Digital privacy · Data privacy · Data utility · Anonymization · De-identification · Data management · HIPAA · GDPR



Acknowledgements

We would like to thank Marjory Blumenthal and Rebecca Balebako for their detailed and thoughtful review of early drafts of this document. We are immensely grateful for their comments and feedback. Any errors contained herein are our own and should not be attributed to them.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

For this type of study, formal consent is not required.



Copyright information

© IUPESM and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. RAND Corporation, Santa Monica, USA
