Abstract
In this paper, we propose a novel edge modification technique that better preserves the communities of a graph while anonymizing it. By maintaining the core number sequence of a graph, its coreness, we retain most of the information contained in the network while allowing changes in the degree sequence, i. e. obfuscating the visible data an attacker has access to. We reach a better trade-off between data privacy and data utility than with existing methods by capitalizing on the slack between apparent degree (node degree) and true degree (node core number). Our extensive experiments on six diverse standard network datasets support this claim. Our framework compares our method to other that are used as proxies for privacy protection in the relevant literature. We demonstrate that our method leads to higher data utility preservation, especially in clustering, for the same levels of randomization and k-anonymity.
Similar content being viewed by others
Notes
A preliminary version of this work appeared in the PhD thesis of one of the authors [10].
In cell biology, the nuclear lamina is a dense fibrillar network that surrounds the nucleus, gives it its shape and stabilizes the nuclear membrane.
References
Adamic LA, Glance N (2005) The political blogosphere and the 2004 U.S. election. In: Proceedings of the international workshop on link discovery, pp 36–43
Assam R, Hassani M, Brysch M, Seidl T (2014) (K, D)-core anonymity: structural anonymization of massive networks. In: Proceedings of the 26th international conference on scientific and statistical database management, pp 17:1–17:12
Backstrom L, Dwork C, Kleinberg, J (2007) Wherefore art thou r3579x? anonymized social networks, hidden patterns, and structural steganography. In: Proceedings of the 16th international conference on World Wide Web, pp 181–190
Batagelj V, Zaveršnik M (2011) Fast algorithms for determining (generalized) core groups in social networks. Adv Data Anal Classif 5(2):129–145
Baur M, Gaertler M, Görke R, Krug M, Wagner D (2007) Generating graphs with predefined k-core structure. In: Proceedings of the European conference on complex systems
Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10)
Bollobás B (1978) Extremal graph theory. Academic Press, London
Cai BJ, Wang HY, Zheng HR, Wang H (2010) Evaluation repeated random walks in community detection of social networks. In: Proceedings of the international conference on machine learning and cybernetics, pp 1849–1854
Carmi S, Havlin S, Kirkpatrick S, Shavitt Y, Shir E (2007) A model of Internet topology using k-shell decomposition. Proc Natl Acad Sci USA 104(27):11,150–11,154
Casas-Roma J (2014) Privacy-preserving and data utility in graph mining. Ph.D. thesis. Universitat Autònoma de Barcelona
Casas-Roma J, Herrera-Joancomartí J, Torra, V (2013) An algorithm for k-degree anonymity on large networks. In: Proceedings of the IEEE international conference on advances on social networks analysis and mining, pp 671–675
Casas-Roma J, Herrera-Joancomartí J, Torra V (2014) Anonymizing graphs: measuring quality for clustering. Knowl Inf Syst 44(3):507–528
Casas-Roma J, Herrera-Joancomartí J, Torra V (2016) \(k\)-degree anonymity and edge selection: improving data utility in large networks. Knowl Inf Syst. doi:10.1007/s10115-016-0947-7
Casas-Roma J, Rousseau F (2015) Community-preserving generalization of social networks. In: Proceedings of the social media and risk ASONAM 2015 workshop
Clauset A, Newman MEJ, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70:066,111
Giatsidis C, Thilikos DM, Vazirgiannis M (2011) Evaluating cooperation in communities with the k-core structure. In: Proceedings of the IEEE international conference on advances in social networks analysis and mining, pp 87–93
Gleiser PM, Danon L (2003) Community structure in jazz. Adv Complex Syst 6(04):565–573
Goltsev AV, Dorogovtsev SN, Mendes JFF (2006) k-core (bootstrap) percolation on complex networks: critical phenomena and nonlocal effects. Phys Rev E 73(5)
Guimerà R, Danon L, Díaz-Guilera A, Giralt F, Arenas A (2003) Self-similar community structure in a network of human interactions. Phys Rev E 68:065103
Hay M, Miklau G, Jensen D, Towsley D, Weis P (2008) Resisting structural re-identification in anonymized social networks. Proc VLDB Endow 1(1):102–114
Hay M, Miklau G, Jensen D, Weis P, Srivastava S (2007) Anonymizing social networks. Technical Report No. 07-19. Computer Science Department, University of Massachusetts Amherst, Amherst
Lancichinetti A, Fortunato S (2009) Community detection algorithms: a comparative analysis. Phys Rev E 80:056,117
Leskovec J, Adamic LA, Huberman BA (2007) The dynamics of viral marketing. ACM Trans Web 1(1):5:1–5:39
Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: densification and shrinking diameters. ACM Trans Knowl Discov Data 1(1):2:1–2:40
Liu K, Terzi E (2008) Towards identity anonymization on graphs. In: Proceedings of the ACM SIGMOD international conference on management of data, pp 93–106
Malle B, Schrittwieser S, Kieseberg P, Holzinger A (2016) Privacy aware machine learning and the right to be forgotten. ERCIM News 107(10):22–23
Malliaros FD, Vazirgiannis M (2013) To stay or not to stay: modeling engagement dynamics in social graphs. In: Proceedings of the 22nd ACM international conference on Information and knowledge management, pp 469–478
Narayanan A, Shmatikov V (2009) De-anonymizing social networks. In: Proceedings of the 2009 30th IEEE symposium on security and privacy, pp 173–187
Pons P, Latapy M (2005) Computing communities in large networks using random walks. In: International symposium on computer and information sciences, vol 3733, pp 284–293
Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. PNAS USA 105(4):1118–1123
Seidman SB (1983) Network structure and minimum degree. Soc Netw 5:269–287
Sweeney L (2002) k-anonymity: a model for protecting privacy. Int J Uncertain Fuzziness Knowl Based Syst 10(5):557–570
Wu X, Ying X, Liu K, Chen L (2010) A survey of privacy-preservation of graphs and social networks. In: Aggarwal CC, Wang H (eds) Managing and mining graph data, advances in database systems, vol 40, pp 421–453
Yahoo! Webscope: Yahoo! Instant Messenger friends connectivity graph, version 1.0 (2003). http://research.yahoo.com/Academic_Relations
Yang J, Leskovec J (2012) Defining and evaluating network communities based on ground-truth. In: Proceedings of the ACM workshop on mining data semantics, pp 1–8
Yang J, Leskovec J (2015) Defining and evaluating network communities based on ground-truth. Comput Res Repos (CoRR) 42(1):181–213
Ying X, Pan K, Wu X, Guo L (2009) Comparisons of randomization and \(k\)-degree anonymization schemes for privacy preserving social network publishing. In: Workshop on social network mining and analysis, pp 10:1–10:10
Ying X, Wu X (2008) Randomizing social networks: a spectrum preserving approach. In: Proceedings of the SIAM international conference on data mining, pp 739–750
Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452–473
Zhang K, Lo D, Lim EP, Prasetyo PK (2013) Mining indirect antagonistic communities from social interactions. Knowl Inf Syst 35(3):553–583
Zhou B, Pei J (2008) Preserving privacy in social networks against neighborhood attacks. In: Proceedings of the IEEE 24th international conference on data engineering, pp 506–515
Zou L, Chen L, Özsu MT (2009) K-automorphism: a general framework for privacy preserving network publication. Proc VLDB Endow 2(1):946–957
Acknowledgements
This work was partly funded by the Spanish MCYT and the FEDER funds under Grants TIN2011-27076-C03 “CO-PRIVACY” and TIN2014-57364-C2-2-R “SMARTGLACIS”.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Rousseau, F., Casas-Roma, J. & Vazirgiannis, M. Community-preserving anonymization of graphs. Knowl Inf Syst 54, 315–343 (2018). https://doi.org/10.1007/s10115-017-1064-y
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10115-017-1064-y