Community-preserving anonymization of graphs

  • Regular Paper
  • Knowledge and Information Systems

Abstract

In this paper, we propose a novel edge modification technique that better preserves the communities of a graph while anonymizing it. By maintaining the core number sequence of a graph (its coreness), we retain most of the information contained in the network while allowing changes in the degree sequence, i.e., obfuscating the visible data an attacker has access to. We reach a better trade-off between data privacy and data utility than existing methods by capitalizing on the slack between apparent degree (node degree) and true degree (node core number). Our extensive experiments on six diverse standard network datasets support this claim. Our framework compares our method to others that are used as proxies for privacy protection in the relevant literature. We demonstrate that our method leads to higher data utility preservation, especially in clustering, for the same levels of randomization and k-anonymity.
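
To make the notion of slack between node degree and core number concrete, the following is a minimal sketch and not the authors' algorithm. It assumes an undirected graph stored as a dictionary of neighbor sets, computes core numbers with a simple peeling procedure, and accepts an edge addition only when every core number is left unchanged. The function names (core_numbers, degree_anonymity_level, add_edge_if_coreness_kept) and the toy graph are hypothetical, chosen for illustration only.

```python
from collections import Counter


def core_numbers(adj):
    """Return {node: core number} by repeatedly peeling a minimum-degree node."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=lambda u: degree[u])
        k = max(k, degree[v])  # core numbers never decrease during peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def degree_anonymity_level(adj):
    """Largest k such that every degree value is shared by at least k nodes."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    return min(counts.values())


def add_edge_if_coreness_kept(adj, u, v):
    """Add edge (u, v) only if no node's core number changes (assumed policy)."""
    if u == v or v in adj[u]:
        return False
    before = core_numbers(adj)
    adj[u].add(v)
    adj[v].add(u)
    if core_numbers(adj) == before:
        return True
    # Roll back: the new edge would have altered the coreness.
    adj[u].discard(v)
    adj[v].discard(u)
    return False


# Toy graph (hypothetical): the 4-cycle a-b-c-d-a, where every node has
# degree 2 and core number 2.
graph = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
cores_before = core_numbers(graph)
level_before = degree_anonymity_level(graph)

# The chord a-c changes the degree sequence but not the core numbers.
accepted_chord = add_edge_if_coreness_kept(graph, "a", "c")
# The chord b-d would turn the graph into a 4-clique (core number 3).
rejected_chord = add_edge_if_coreness_kept(graph, "b", "d")
```

In this toy case the first chord is accepted because nodes b and d still have degree 2, which keeps every node at core number 2 even though the degrees of a and c rise to 3. The second chord is rolled back because it would raise every core number to 3.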

Notes

  1. A preliminary version of this work appeared in the PhD thesis of one of the authors [10].

  2. In cell biology, the nuclear lamina is a dense fibrillar network that surrounds the nucleus, gives it its shape and stabilizes the nuclear membrane.

  3. http://igraph.org/python/doc/igraph.GraphBase-class.html.

References

  1. Adamic LA, Glance N (2005) The political blogosphere and the 2004 U.S. election. In: Proceedings of the international workshop on link discovery, pp 36–43

  2. Assam R, Hassani M, Brysch M, Seidl T (2014) (K, D)-core anonymity: structural anonymization of massive networks. In: Proceedings of the 26th international conference on scientific and statistical database management, pp 17:1–17:12

  3. Backstrom L, Dwork C, Kleinberg J (2007) Wherefore art thou r3579x? anonymized social networks, hidden patterns, and structural steganography. In: Proceedings of the 16th international conference on World Wide Web, pp 181–190

  4. Batagelj V, Zaveršnik M (2011) Fast algorithms for determining (generalized) core groups in social networks. Adv Data Anal Classif 5(2):129–145

  5. Baur M, Gaertler M, Görke R, Krug M, Wagner D (2007) Generating graphs with predefined k-core structure. In: Proceedings of the European conference on complex systems

  6. Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10)

  7. Bollobás B (1978) Extremal graph theory. Academic Press, London

  8. Cai BJ, Wang HY, Zheng HR, Wang H (2010) Evaluation repeated random walks in community detection of social networks. In: Proceedings of the international conference on machine learning and cybernetics, pp 1849–1854

  9. Carmi S, Havlin S, Kirkpatrick S, Shavitt Y, Shir E (2007) A model of Internet topology using k-shell decomposition. Proc Natl Acad Sci USA 104(27):11150–11154

  10. Casas-Roma J (2014) Privacy-preserving and data utility in graph mining. Ph.D. thesis. Universitat Autònoma de Barcelona

  11. Casas-Roma J, Herrera-Joancomartí J, Torra V (2013) An algorithm for k-degree anonymity on large networks. In: Proceedings of the IEEE international conference on advances on social networks analysis and mining, pp 671–675

  12. Casas-Roma J, Herrera-Joancomartí J, Torra V (2014) Anonymizing graphs: measuring quality for clustering. Knowl Inf Syst 44(3):507–528

  13. Casas-Roma J, Herrera-Joancomartí J, Torra V (2016) k-degree anonymity and edge selection: improving data utility in large networks. Knowl Inf Syst. doi:10.1007/s10115-016-0947-7

  14. Casas-Roma J, Rousseau F (2015) Community-preserving generalization of social networks. In: Proceedings of the social media and risk ASONAM 2015 workshop

  15. Clauset A, Newman MEJ, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70:066111

  16. Giatsidis C, Thilikos DM, Vazirgiannis M (2011) Evaluating cooperation in communities with the k-core structure. In: Proceedings of the IEEE international conference on advances in social networks analysis and mining, pp 87–93

  17. Gleiser PM, Danon L (2003) Community structure in jazz. Adv Complex Syst 6(04):565–573

  18. Goltsev AV, Dorogovtsev SN, Mendes JFF (2006) k-core (bootstrap) percolation on complex networks: critical phenomena and nonlocal effects. Phys Rev E 73(5)

  19. Guimerà R, Danon L, Díaz-Guilera A, Giralt F, Arenas A (2003) Self-similar community structure in a network of human interactions. Phys Rev E 68:065103

  20. Hay M, Miklau G, Jensen D, Towsley D, Weis P (2008) Resisting structural re-identification in anonymized social networks. Proc VLDB Endow 1(1):102–114

  21. Hay M, Miklau G, Jensen D, Weis P, Srivastava S (2007) Anonymizing social networks. Technical Report No. 07-19. Computer Science Department, University of Massachusetts Amherst, Amherst

  22. Lancichinetti A, Fortunato S (2009) Community detection algorithms: a comparative analysis. Phys Rev E 80:056117

  23. Leskovec J, Adamic LA, Huberman BA (2007) The dynamics of viral marketing. ACM Trans Web 1(1):5:1–5:39

  24. Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: densification and shrinking diameters. ACM Trans Knowl Discov Data 1(1):2:1–2:40

  25. Liu K, Terzi E (2008) Towards identity anonymization on graphs. In: Proceedings of the ACM SIGMOD international conference on management of data, pp 93–106

  26. Malle B, Schrittwieser S, Kieseberg P, Holzinger A (2016) Privacy aware machine learning and the right to be forgotten. ERCIM News 107(10):22–23

  27. Malliaros FD, Vazirgiannis M (2013) To stay or not to stay: modeling engagement dynamics in social graphs. In: Proceedings of the 22nd ACM international conference on Information and knowledge management, pp 469–478

  28. Narayanan A, Shmatikov V (2009) De-anonymizing social networks. In: Proceedings of the 2009 30th IEEE symposium on security and privacy, pp 173–187

  29. Pons P, Latapy M (2005) Computing communities in large networks using random walks. In: International symposium on computer and information sciences, vol 3733, pp 284–293

  30. Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci USA 105(4):1118–1123

  31. Seidman SB (1983) Network structure and minimum degree. Soc Netw 5:269–287

  32. Sweeney L (2002) k-anonymity: a model for protecting privacy. Int J Uncertain Fuzziness Knowl Based Syst 10(5):557–570

  33. Wu X, Ying X, Liu K, Chen L (2010) A survey of privacy-preservation of graphs and social networks. In: Aggarwal CC, Wang H (eds) Managing and mining graph data, advances in database systems, vol 40, pp 421–453

  34. Yahoo! Webscope: Yahoo! Instant Messenger friends connectivity graph, version 1.0 (2003). http://research.yahoo.com/Academic_Relations

  35. Yang J, Leskovec J (2012) Defining and evaluating network communities based on ground-truth. In: Proceedings of the ACM workshop on mining data semantics, pp 1–8

  36. Yang J, Leskovec J (2015) Defining and evaluating network communities based on ground-truth. Comput Res Repos (CoRR) 42(1):181–213

  37. Ying X, Pan K, Wu X, Guo L (2009) Comparisons of randomization and k-degree anonymization schemes for privacy preserving social network publishing. In: Workshop on social network mining and analysis, pp 10:1–10:10

  38. Ying X, Wu X (2008) Randomizing social networks: a spectrum preserving approach. In: Proceedings of the SIAM international conference on data mining, pp 739–750

  39. Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452–473

  40. Zhang K, Lo D, Lim EP, Prasetyo PK (2013) Mining indirect antagonistic communities from social interactions. Knowl Inf Syst 35(3):553–583

  41. Zhou B, Pei J (2008) Preserving privacy in social networks against neighborhood attacks. In: Proceedings of the IEEE 24th international conference on data engineering, pp 506–515

  42. Zou L, Chen L, Özsu MT (2009) K-automorphism: a general framework for privacy preserving network publication. Proc VLDB Endow 2(1):946–957

Acknowledgements

This work was partly funded by the Spanish MCYT and the FEDER funds under Grants TIN2011-27076-C03 “CO-PRIVACY” and TIN2014-57364-C2-2-R “SMARTGLACIS”.

Author information

Corresponding author

Correspondence to Jordi Casas-Roma.

About this article

Cite this article

Rousseau, F., Casas-Roma, J. & Vazirgiannis, M. Community-preserving anonymization of graphs. Knowl Inf Syst 54, 315–343 (2018). https://doi.org/10.1007/s10115-017-1064-y

  • DOI: https://doi.org/10.1007/s10115-017-1064-y
