Abstract
Individuals generate a tremendous amount of personal data each day, with a wide variety of uses. These data often contain sensitive information about individuals, which can be disclosed by “adversaries”. Even when direct identifiers such as social security numbers are masked, an adversary may be able to recognize the individual behind a data record by looking at the values of quasi-identifiers (QIDs), known as identity disclosure, or may uncover sensitive attributes (SAs) about an individual, known as attribute disclosure. In the data privacy field, multiple disclosure risk measures have been proposed. These share two drawbacks: they do not consider identity and attribute disclosure concurrently, and they assume a restrictive attack model in which certain attributes are fixed in advance as QIDs and SAs. In this paper, we present a flexible adversary disclosure risk measure that addresses these limitations. It provides a single combined metric of identity and attribute disclosure, and it generalizes attack models by considering all scenarios for an adversary’s knowledge and disclosure targets while retaining the flexibility to model a specific disclosure preference. We have developed an efficient algorithm for computing the proposed risk measure and evaluated the performance of our approach on a benchmark dataset from the 1994 Census database.
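To make the distinction between identity and attribute disclosure concrete, the following minimal Python sketch computes two textbook-style per-record risks on a toy table. It is not the combined measure proposed in the paper: the records, the function names, and the choice of 1 / (equivalence class size) and "share of the class holding the same sensitive value" as risk proxies are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (age_band, zip_prefix, diagnosis). All values are invented.
RECORDS = [
    ("30-39", "708", "flu"),
    ("30-39", "708", "flu"),
    ("30-39", "708", "asthma"),
    ("40-49", "708", "diabetes"),
    ("40-49", "701", "flu"),
    ("40-49", "701", "flu"),
]


def qid_key(record, qid_idx):
    """Project a record onto its quasi-identifier columns."""
    return tuple(record[i] for i in qid_idx)


def group_by_qids(records, qid_idx):
    """Build equivalence classes: records that share all QID values."""
    groups = defaultdict(list)
    for rec in records:
        groups[qid_key(rec, qid_idx)].append(rec)
    return groups


def identity_risk(records, qid_idx):
    """Per-record identity risk proxy: 1 / size of the record's equivalence class."""
    groups = group_by_qids(records, qid_idx)
    return [1.0 / len(groups[qid_key(rec, qid_idx)]) for rec in records]


def attribute_risk(records, qid_idx, sa_idx):
    """Per-record attribute risk proxy: share of the equivalence class
    that holds the same sensitive value as the record."""
    groups = group_by_qids(records, qid_idx)
    risks = []
    for rec in records:
        cls = groups[qid_key(rec, qid_idx)]
        counts = Counter(r[sa_idx] for r in cls)
        risks.append(counts[rec[sa_idx]] / len(cls))
    return risks


if __name__ == "__main__":
    # Assumed attack model: age band and ZIP prefix are QIDs, diagnosis is the SA.
    id_risks = identity_risk(RECORDS, qid_idx=(0, 1))
    attr_risks = attribute_risk(RECORDS, qid_idx=(0, 1), sa_idx=2)
    for rec, ir, ar in zip(RECORDS, id_risks, attr_risks):
        print(rec, f"identity={ir:.2f}", f"attribute={ar:.2f}")
```

In this toy table the last two records share an equivalence class of size two, so their identity risk proxy is 0.5, yet both carry the same diagnosis, so their attribute risk proxy is 1.0. Cases like this are one reason for wanting a measure that considers both kinds of disclosure together, and the hard-coded `qid_idx` and `sa_idx` arguments show the kind of fixed attack model the abstract describes as restrictive.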
Research Data Policy and Data Availability Statements
The datasets generated and/or analyzed during the current study are available in the UCI Machine Learning Repository, https://archive.ics.uci.edu/ml/datasets/adult.
Ethics declarations
Conflict of interest
Marmar Orooji declares that she has no conflict of interest. Seyedeh Shaghayegh Rabbanian declares that she has no conflict of interest. Gerald M. Knapp declares that he has no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Orooji, M., Rabbanian, S.S. & Knapp, G.M. Flexible adversary disclosure risk measure for identity and attribute disclosure attacks. Int. J. Inf. Secur. 22, 631–645 (2023). https://doi.org/10.1007/s10207-022-00654-y
DOI: https://doi.org/10.1007/s10207-022-00654-y