Abstract
Privacy-preserving methods, or anonymization processes, add noise to prevent adversaries from re-identifying users in anonymous networks. The noise introduced by the anonymization steps may also affect the data, reducing its utility for subsequent data mining processes. Graph modification approaches are among the most widely used and best-known methods for protecting the privacy of the data. These methods transform the data through vertex and edge modifications before the perturbed data is released. In this paper we analyze the vertex and edge modification techniques found in the literature on this topic. We empirically evaluate the information loss introduced by each of these methods, using not only generic metrics related to graph properties but also specific metrics related to real graph-mining tasks. We point out how these methods affect the main properties and characteristics of the network, since this helps in choosing the best method to achieve a desired privacy level while preserving data utility.
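As a rough illustration of the kind of edge modification and generic information-loss measurement described above, the following minimal sketch perturbs a small graph by random edge deletion and addition and then compares the result with the original. It is not one of the methods evaluated in the paper: the perturbation scheme, the function names (`perturb_edges`, `degree_sequence`, `edge_jaccard`), the ring graph, and the choice of edge overlap and degree sequence as metrics are all assumptions made for illustration.

```python
import random
from itertools import combinations


def normalize(edges):
    """Store each undirected edge as a sorted (u, v) tuple."""
    return {tuple(sorted(e)) for e in edges}


def perturb_edges(edges, n_vertices, fraction, seed=0):
    """Delete a fraction of the edges at random and add as many new ones.

    Illustrative random add/delete perturbation (an assumption, not the
    paper's algorithm). The number of edges is preserved.
    """
    rng = random.Random(seed)
    original = normalize(edges)
    # All vertex pairs that are not currently connected.
    non_edges = [p for p in combinations(range(n_vertices), 2)
                 if p not in original]
    k = min(round(fraction * len(original)), len(non_edges))
    removed = set(rng.sample(sorted(original), k))
    added = set(rng.sample(non_edges, k))
    return (original - removed) | added


def degree_sequence(edges, n_vertices):
    """Return the degree of every vertex, indexed by vertex id."""
    deg = [0] * n_vertices
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def edge_jaccard(a, b):
    """Jaccard similarity of two edge sets (1.0 means identical)."""
    a, b = normalize(a), normalize(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical toy graph: a ring of 10 vertices.
ring = [(i, (i + 1) % 10) for i in range(10)]
perturbed = perturb_edges(ring, n_vertices=10, fraction=0.2, seed=42)

print("edge overlap (Jaccard):", edge_jaccard(ring, perturbed))
print("degree sequence before:", degree_sequence(ring, 10))
print("degree sequence after: ", degree_sequence(perturbed, 10))
```

A lower edge overlap or a larger change in the degree sequence indicates more information loss under this toy measure; task-specific metrics, such as the quality of detected communities, would need a graph library and are not sketched here.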
Notes
A preliminary, short version of this paper appeared at MDAI 2015 (Casas-Roma 2015).
Available at: http://igraph.org/.
References
Adamic LA, Glance N (2005) The political blogosphere and the 2004 US election: divided they blog. In: LinkKDD ’05: proceedings of the 3rd international workshop on Link discovery, pp 36–43
Backstrom L, Dwork C, Kleinberg J (2007) Wherefore art thou r3579x? anonymized social networks, hidden patterns, and structural steganography. In: International conference on world wide web (WWW), pp 181–190, New York, NY, USA. ACM Press. https://doi.org/10.1145/1242572.1242598
Barabási A, Albert R (1999) Emergence of scaling in random networks. Science 286:509–512
Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech 2008(10):P10008. https://doi.org/10.1088/1742-5468/2008/10/P10008
Bonchi F, Gionis A, Tassa T (2011) Identity obfuscation in graphs through the information theoretic lens. In: 2011 IEEE 27th international conference on data engineering, pp 924–935, Washington, DC, USA. IEEE Computer Society. https://doi.org/10.1109/ICDE.2011.5767905
Bonchi F, Gionis A, Tassa T (2014) Identity obfuscation in graphs through the information theoretic lens. Inform Sci 275:232–256. https://doi.org/10.1016/j.ins.2014.02.035
Cai B-J, Wang H-Y, Zheng H-R, Wang H (2010) Evaluation repeated random walks in community detection of social networks. In: International conference on machine learning and cybernetics (ICMLC), pp 1849–1854, Qingdao, China. IEEE Computer Society. https://doi.org/10.1109/ICMLC.2010.5580953
Casas-Roma J (2014) Privacy-preserving on graphs using randomization and edge-relevance. In: Vicenç T (ed) International conference on modeling decisions for artificial intelligence (MDAI). Springer International Publishing, Tokyo, pp 204–216
Casas-Roma J (2015) An evaluation of edge modification techniques for privacy-preserving on graphs. In: Vicenc T, N Torra (eds) International conference on modeling decisions for artificial intelligence (MDAI), lecture notes in computer science. Springer International Publishing, Skövde, pp 180–191. https://doi.org/10.1007/978-3-319-23240-9
Casas-Roma J, Herrera-Joancomartí J, Torra V (2013) An algorithm for k-degree anonymity on large networks. In: IEEE international conference on advances on social networks analysis and mining (ASONAM), pp 671–675, Niagara Falls, CA. IEEE Computer Society
Casas-Roma J, Herrera-Joancomartí J, Torra V (2014) Anonymizing graphs: measuring quality for clustering. Knowl Inform Syst (KAIS) 44(3):507–528. https://doi.org/10.1007/s10115-014-0774-7
Casas-Roma J, Herrera-Joancomartí J, Torra V (2017a) A survey of graph-modification techniques for privacy-preserving on networks. Artif Intell Rev 47(3):341–366. https://doi.org/10.1007/s10462-016-9484-8
Casas-Roma J, Herrera-Joancomartí J, Torra V (2017b) k-degree anonymity and edge selection: improving data utility in large networks. Knowl Inform Syst 50(2):447–474. https://doi.org/10.1007/s10115-016-0947-7
Chakraborty S, Ambooken JG, Tripathy BK, Purushotham S (2015) Analysis and performance enhancement to achieve recursive (c, l) diversity anonymization in social networks. Trans Data Priv (TDP) 8(2):173–215
Cheng J, Fu AW-C, Liu J (2010) K-isomorphism: privacy preserving network publication against structural attacks. In: International conference on management of data (SIGMOD), pp 459–470, New York, New York, USA. ACM Press. https://doi.org/10.1145/1807167.1807218
Chester S, Kapron BM, Ramesh G, Srivastava G, Thomo A, Venkatesh S (2011) k-Anonymization of social networks by vertex addition. In: Eder J, Bieliková M, Tjoa AM (eds) ADBIS 2011, Research communications, Proceedings II of the 15th East-European conference on advances in databases and information systems, September 20–23, 2011, CEUR workshop proceedings 789, CEUR-WS.org 2011. Vienna, Austria
Chester S, Gaertner J, Stege U, Venkatesh S (2012) Anonymizing subsets of social networks with degree constrained subgraphs. In: IEEE international conference on advances on social networks analysis and mining (ASONAM), pp 418–422, Washington, DC, USA. IEEE Computer Society. https://doi.org/10.1109/ASONAM.2012.74
Chester S, Kapron BM, Ramesh G, Srivastava G, Thomo A, Venkatesh S (2013) Why Waldo befriended the dummy? k-anonymization of social networks with pseudo-nodes. Soc Netw Anal Min 3(3):381–399. https://doi.org/10.1007/s13278-012-0084-6
Clauset A, Newman ME, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70(6):066111. https://doi.org/10.1103/PhysRevE.70.066111
Dwork C (2006) Differential privacy. In: Michele B, Bart P, Vladimiro S, Ingo W (eds) International colloquium on automata, languages, and programming (ICALP), vol 4052. Springer-Verlag, Berlin, Heidelberg, pp 1–12. https://doi.org/10.1007/11787006_1
Erdös P, Rényi A (1959) On random graphs I. Publ Math Debr 6:290
Ferri F, Grifoni P, Guzzo T (2012) New forms of social and professional digital relationships: the case of Facebook. Soc Netw Anal Min (SNAM) 2(2):121–137. https://doi.org/10.1007/s13278-011-0038-4
Girvan M, Newman ME (2002) Community structure in social and biological networks. Proc Natl Acad Sci USA (PNAS) 99(12):7821–7826. https://doi.org/10.1073/pnas.122653799
Gleiser P, Danon L (2003) Community structure in jazz. Adv Complex Syst 6(4):565–573
Guimerà R, Danon L, Díaz-Guilera A, Giralt F, Arenas A (2003) Self-similar community structure in a network of human interactions. Phys Rev E 68:065103
Hanhijärvi S, Garriga GC, Puolamäki K (2009) Randomization techniques for graphs. In: SIAM conference on data mining (SDM), pp 780–791, Sparks, Nevada, USA. SIAM
Hay M, Miklau G, Jensen D, Weis P, Srivastava S (2007) Anonymizing social networks. Technical Report No. 07-19, UMass Amherst
Hay M, Miklau G, Jensen D, Towsley D, Weis P (2008) Resisting structural re-identification in anonymized social networks. Proc VLDB Endow 1(1):102–114
Hay M, Li C, Miklau G, Jensen D (2009) Accurate estimation of the degree distribution of private networks. In: IEEE international conference on data mining (ICDM), pp 169–178, Miami, FL. IEEE Computer Society. https://doi.org/10.1109/ICDM.2009.11
Hay M, Liu K, Miklau G, Pei J, Terzi E (2011) Privacy-aware data management in information networks. In: International conference on management of data (SIGMOD), pp 1201–1204, New York, New York, USA. ACM Press. https://doi.org/10.1145/1989323.1989453
Kapron BM, Srivastava G, Venkatesh S (2011) Social network anonymization via edge addition. In: IEEE international conference on advances on social networks analysis and mining (ASONAM), pp 155–162, Kaohsiung. IEEE Computer Society. https://doi.org/10.1109/ASONAM.2011.108
Lancichinetti A, Fortunato S (2009) Community detection algorithms: a comparative analysis. Phys Rev E 80(5):056117
Leskovec J, Huttenlocher D, Kleinberg J (2010) Predicting positive and negative links in online social networks. In: Proceedings of the 19th international conference on World wide web, pp 641–650. ACM
Liu K, Terzi E (2008) Towards identity anonymization on graphs. In: ACM SIGMOD international conference on management of data, pp 93–106, New York, NY, USA. ACM Press. https://doi.org/10.1145/1376616.1376629
Lu X, Song Y, Bressan S (2012) Fast identity anonymization on graphs. In: 23rd international conference on database and expert systems applications (DEXA ’12), pp 281–295, Vienna, Austria. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-32600-4_21
Nagle F (2013) Privacy breach analysis in social networks. In: Tansel Ö, Zeki E, Suheil K (eds) Mining social networks and security informatics. Springer, Dordrecht, pp 63–77. https://doi.org/10.1007/978-94-007-6359-3
Page L, Brin S, Motwani R, Winograd T (1998) The pagerank citation ranking: Bringing order to the web. In: Proceedings of the 7th international world wide web conference, pp 161–172, Brisbane, Australia
Pons P, Latapy M (2005) Computing communities in large networks using random walks. In: Yolum, Güngör T, Gürgen F, Özturan C (eds) Computer and information sciences - ISCIS 2005. ISCIS 2005. Lecture notes in computer science, vol 3733. Springer, Berlin, Heidelberg
Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci 105(4):1118–1123. https://doi.org/10.1073/pnas.0706851105
Stokes K, Torra V (2012) Reidentification and k-anonymity: a model for disclosure risk in graphs. Soft Comput 16(10):1657–1670. https://doi.org/10.1007/s00500-012-0850-4
Sweeney L (2002) k-anonymity: a model for protecting privacy. Int J Uncertain Fuzziness Knowl Based Syst (IJUFKS) 10(5):557–570. https://doi.org/10.1142/S0218488502001648
Tripathy BK, Panda GK (2010) A new approach to manage security against neighborhood attacks in social networks. In: IEEE international conference on advances on social networks analysis and mining (ASONAM), pp 264–269, Odense, Denmark. IEEE. https://doi.org/10.1109/ASONAM.2010.69
Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393(6684):440–442. https://doi.org/10.1038/30918
Wu W, Xiao Y, Wang W, He Z, Wang Z (2010a) K-symmetry model for identity anonymization in social networks. In: International conference on extending database technology (EDBT), pp 111–122, Lausanne, Switzerland. ACM Press. https://doi.org/10.1145/1739041.1739058
Wu X, Ying X, Liu K, Chen L (2010b) A survey of privacy-preservation of graphs and social networks. Springer, Boston, pp 421–453. https://doi.org/10.1007/978-1-4419-6045-0_14
Ying X, Wu X (2008) Randomizing social networks: a spectrum preserving approach. In: SIAM conference on data mining (SDM), pp 739–750, Atlanta, Georgia, USA. SIAM
Ying X, Pan K, Wu X, Guo L (2009) Comparisons of randomization and K-degree anonymization schemes for privacy preserving social network publishing. In: Proceedings of the 3rd workshop on social network mining and analysis. ACM Press, Paris, France, pp 10:1–10:10. https://doi.org/10.1145/1731011.1731021
Yuan M, Chen L, Yu PS, Yu T (2013) Protecting sensitive labels in social network data anonymization. IEEE Trans Knowl Data Eng (TKDE) 25(3):633–647. https://doi.org/10.1109/TKDE.2011.259
Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452–473
Zhang K, Lo D, Lim E-P, Prasetyo PK (2013) Mining indirect antagonistic communities from social interactions. Knowl Inform Syst 35(3):553–583. https://doi.org/10.1007/s10115-012-0519-4
Zhou B, Pei J (2008) Preserving privacy in social networks against neighborhood attacks. In: IEEE international conference on data engineering (ICDE), pp 506–515, Washington, DC, USA. IEEE Computer Society. https://doi.org/10.1109/ICDE.2008.4497459
Zou L, Chen L, Özsu MT (2009) K-automorphism: a general framework for privacy preserving network publication. Proc VLDB Endow 2(1):946–957
Acknowledgements
This work was supported by the Spanish Government, in part under Grant RTI2018-095094-B-C22 “CONSENT”, and in part under Grant TIN2014-57364-C2-2-R “SMARTGLACIS”.
About this article
Cite this article
Casas-Roma, J. An evaluation of vertex and edge modification techniques for privacy-preserving on graphs. J Ambient Intell Human Comput 14, 15109–15125 (2023). https://doi.org/10.1007/s12652-019-01363-6
DOI: https://doi.org/10.1007/s12652-019-01363-6