On Minimality Attack for Privacy-Preserving Data Publishing
Preserving privacy while publishing data is an important requirement in many practical applications. Information about individuals and organizations is collected from various sources and published after some pre-processing, which may leak sensitive information about individuals. Anonymization is a widely used technique that suppresses or generalizes data so that the essence of the data is hidden to a certain degree. In this paper, we analyze well-known anonymization-based privacy-preserving schemes, such as k-anonymity and l-diversity, and show how they suffer from the minimality attack, which can lead to information leakage from the published data. We present a mitigation mechanism, the NoMin algorithm, to address the minimality attack in anonymization-based privacy-preserving schemes. The proposed NoMin algorithm uses a random sample of spurious records in an equivalence class of actual records so that an adversary cannot identify an individual from the published data. The analysis and experimental results illustrate the algorithm's strengths, practicality, and limitations with respect to minimality attacks on anonymization-based published data.
Keywords: Privacy, Anonymization, k-anonymity, l-diversity, Minimality attack
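To make the two privacy models named in the abstract concrete, the following minimal Python sketch checks whether an already generalized table satisfies k-anonymity and distinct l-diversity. It is illustrative only and is not the NoMin algorithm. The column names (`age`, `zip`, `disease`), the sample records, and the helper functions are all hypothetical, and distinct l-diversity (at least l different sensitive values per equivalence class) is assumed as the l-diversity variant.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        classes[key].append(rec)
    return dict(classes)


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class holds at least k records."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_distinct_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every equivalence class has at least l distinct sensitive values."""
    groups = equivalence_classes(records, quasi_identifiers)
    return all(
        len({rec[sensitive] for rec in group}) >= l
        for group in groups.values()
    )


# Hypothetical generalized table: ages bucketed, ZIP codes partially suppressed.
published = [
    {"age": "20-29", "zip": "476**", "disease": "flu"},
    {"age": "20-29", "zip": "476**", "disease": "HIV"},
    {"age": "30-39", "zip": "479**", "disease": "flu"},
    {"age": "30-39", "zip": "479**", "disease": "flu"},
]

qi = ["age", "zip"]
two_anonymous = is_k_anonymous(published, qi, k=2)
two_diverse = is_distinct_l_diverse(published, qi, "disease", l=2)
```

In this sample, both equivalence classes contain two records, so the table is 2-anonymous, but the second class carries only one disease value, so it is not 2-diverse.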
This research was supported in part by the Indo-French Centre for the Promotion of Advanced Research (IFCPAR) and the Center Franco-Indien Pour La Promotion De La Recherche Avancée (CEFIPRA) through the project DST/CNRS 2015-03 under DST-INRIA-CNRS Targeted Programme.