On Minimality Attack for Privacy-Preserving Data Publishing
Abstract
Preserving privacy while publishing data is an important requirement in many practical applications. Information about individuals and organizations is collected from various sources and published after some pre-processing, which may leak sensitive information about individuals. Anonymization is a widely used technique that suppresses or generalizes data so that the essence of the data is hidden to a certain degree. In this paper, we analyze well-known anonymization-based privacy-preserving schemes, such as k-anonymity and l-diversity, and show how they suffer from the minimality attack, which can lead to information leakage from the published data. We present a mitigation mechanism, the NoMin algorithm, to address the minimality attack in anonymization-based privacy-preserving schemes. The proposed NoMin algorithm uses a random sample of spurious records in an equivalence class of actual records so that an adversary cannot identify an individual from the published data. The analysis and experimental results illustrate the algorithm's strengths, practicality, and limitations with respect to minimality attacks on anonymized published data.
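To make the two privacy models named above concrete, the following is a minimal sketch of how k-anonymity and (distinct) l-diversity can be checked over the equivalence classes of a published table. It is an illustration only, not the NoMin algorithm or any code from the paper. The function names, the column names (`age`, `zip`, `disease`), and the toy records are all assumptions made for this example.

```python
from collections import defaultdict


def equivalence_classes(rows, quasi_identifiers):
    """Group rows that share the same (already generalized) quasi-identifier values."""
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        classes[key].append(row)
    return classes


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every equivalence class contains at least k records."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return all(len(group) >= k for group in classes.values())


def is_l_diverse(rows, quasi_identifiers, sensitive, l):
    """Distinct l-diversity: every class holds at least l distinct sensitive values."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return all(
        len({record[sensitive] for record in group}) >= l
        for group in classes.values()
    )


# Hypothetical generalized table: two equivalence classes of two records each.
table = [
    {"age": "20-29", "zip": "560**", "disease": "flu"},
    {"age": "20-29", "zip": "560**", "disease": "flu"},
    {"age": "30-39", "zip": "561**", "disease": "cancer"},
    {"age": "30-39", "zip": "561**", "disease": "flu"},
]

qi = ["age", "zip"]
two_anonymous = is_k_anonymous(table, qi, 2)
two_diverse = is_l_diverse(table, qi, "disease", 2)
```

In this toy table every class has two records, so the table is 2-anonymous, but the first class contains a single sensitive value and therefore fails 2-diversity. That gap is the kind of weakness that motivates stronger models than k-anonymity alone.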
Keywords
Privacy, Anonymization, k-anonymity, l-diversity, Minimality attack
Notes
Acknowledgment
This research was supported in part by the Indo-French Centre for the Promotion of Advanced Research (IFCPAR) and the Center Franco-Indien Pour La Promotion De La Recherche Avancée (CEFIPRA) through the project DST/CNRS 2015-03 under DST-INRIA-CNRS Targeted Programme.