k^m-Anonymity for Continuous Data Using Dynamic Hierarchies

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8744)

Abstract

Many organizations, enterprises, and public services collect and manage personal data about individuals. These data are of substantial value to scientists and market experts, but disseminating them carelessly can lead to significant privacy breaches, since they may reveal financial, medical, or other personal information. Several anonymization methods have been proposed to allow privacy-preserving sharing of datasets that contain personal information. Anonymization techniques trade off the strength of the privacy guarantee against the quality of the anonymized dataset. In this work we focus on the anonymization of sets of values from continuous domains, e.g., numerical data, and we provide a method that protects the anonymized data against identity disclosure attacks. The main novelty of our approach is that, instead of using a fixed, given generalization hierarchy, we let the anonymization algorithm decide how different values will be generalized. The benefit is twofold: (a) we can generalize datasets without requiring an expert to define the hierarchy, and (b) we limit the information loss, since the algorithm is able to restrict the scope of each generalization. We provide a series of experiments that demonstrate the gains in information quality of our algorithm compared to the state of the art.
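
The abstract does not define the privacy guarantee itself. As a minimal sketch, assuming the usual definition of k^m-anonymity for set-valued data (every combination of at most m items that occurs in some record must occur in at least k records), the check below shows how generalizing numeric values to ranges can make a dataset satisfy it. The function names, the sample data, and the hand-picked ranges are hypothetical illustrations; the paper's algorithm chooses the ranges dynamically, which this sketch does not attempt.

```python
from collections import Counter
from itertools import combinations


def generalize(record, ranges):
    """Replace each value by the (low, high) range containing it, if any.

    `ranges` is a hand-picked list of closed intervals (an assumption of this
    sketch; the paper lets the algorithm decide them).
    """
    out = set()
    for value in record:
        for low, high in ranges:
            if low <= value <= high:
                out.add((low, high))
                break
        else:
            # No range covers the value: keep it as a degenerate range.
            out.add((value, value))
    return frozenset(out)


def is_km_anonymous(records, k, m):
    """Return True if every combination of at most m items that occurs in
    some record occurs in at least k records."""
    counts = Counter()
    for record in records:
        items = sorted(record)  # canonical order so equal combos match
        for size in range(1, m + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return all(count >= k for count in counts.values())


# Hypothetical records: each is a set of numeric values for one individual.
raw = [{10, 52}, {12, 55}, {11, 58}, {14, 51}]
ranges = [(10, 14), (51, 58)]

raw_as_ranges = [frozenset((v, v) for v in r) for r in raw]
anonymized = [generalize(r, ranges) for r in raw]

print(is_km_anonymous(raw_as_ranges, k=2, m=2))  # False: every value is unique
print(is_km_anonymous(anonymized, k=2, m=2))     # True: all records coincide
```

Wider ranges make the check easier to pass but discard more detail, which is the trade-off between privacy guarantee and data quality that the abstract describes.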



Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Gkountouna, O., Angeli, S., Zigomitros, A., Terrovitis, M., Vassiliou, Y. (2014). k^m-Anonymity for Continuous Data Using Dynamic Hierarchies. In: Domingo-Ferrer, J. (eds) Privacy in Statistical Databases. PSD 2014. Lecture Notes in Computer Science, vol 8744. Springer, Cham. https://doi.org/10.1007/978-3-319-11257-2_13

  • DOI: https://doi.org/10.1007/978-3-319-11257-2_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11256-5

  • Online ISBN: 978-3-319-11257-2

  • eBook Packages: Computer Science, Computer Science (R0)
