Advertisement

Knowledge and Information Systems

, Volume 54, Issue 2, pp 315–343 | Cite as

Community-preserving anonymization of graphs

  • François Rousseau
  • Jordi Casas-Roma
  • Michalis Vazirgiannis
Regular Paper
  • 284 Downloads

Abstract

In this paper, we propose a novel edge modification technique that better preserves the communities of a graph while anonymizing it. By maintaining the core number sequence of a graph, its coreness, we retain most of the information contained in the network while allowing changes in the degree sequence, i. e. obfuscating the visible data an attacker has access to. We reach a better trade-off between data privacy and data utility than with existing methods by capitalizing on the slack between apparent degree (node degree) and true degree (node core number). Our extensive experiments on six diverse standard network datasets support this claim. Our framework compares our method to other that are used as proxies for privacy protection in the relevant literature. We demonstrate that our method leads to higher data utility preservation, especially in clustering, for the same levels of randomization and k-anonymity.

Keywords

Privacy Data mining Graph algorithms Anonymization Social networks Core number sequence Graph degeneracy 

Notes

Acknowledgements

This work was partly funded by the Spanish MCYT and the FEDER funds under Grants TIN2011-27076-C03 “CO-PRIVACY” and TIN2014-57364-C2-2-R “SMARTGLACIS”.

References

  1. 1.
    Adamic LA, Glance N (2005) The political blogosphere and the 2004 U.S. election. In: Proceedings of the international workshop on link discovery, pp 36–43Google Scholar
  2. 2.
    Assam R, Hassani M, Brysch M, Seidl T (2014) (K, D)-core anonymity: structural anonymization of massive networks. In: Proceedings of the 26th international conference on scientific and statistical database management, pp 17:1–17:12Google Scholar
  3. 3.
    Backstrom L, Dwork C, Kleinberg, J (2007) Wherefore art thou r3579x? anonymized social networks, hidden patterns, and structural steganography. In: Proceedings of the 16th international conference on World Wide Web, pp 181–190Google Scholar
  4. 4.
    Batagelj V, Zaveršnik M (2011) Fast algorithms for determining (generalized) core groups in social networks. Adv Data Anal Classif 5(2):129–145MathSciNetCrossRefzbMATHGoogle Scholar
  5. 5.
    Baur M, Gaertler M, Görke R, Krug M, Wagner D (2007) Generating graphs with predefined k-core structure. In: Proceedings of the European conference on complex systemsGoogle Scholar
  6. 6.
    Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10)Google Scholar
  7. 7.
    Bollobás B (1978) Extremal graph theory. Academic Press, LondonzbMATHGoogle Scholar
  8. 8.
    Cai BJ, Wang HY, Zheng HR, Wang H (2010) Evaluation repeated random walks in community detection of social networks. In: Proceedings of the international conference on machine learning and cybernetics, pp 1849–1854Google Scholar
  9. 9.
    Carmi S, Havlin S, Kirkpatrick S, Shavitt Y, Shir E (2007) A model of Internet topology using k-shell decomposition. Proc Natl Acad Sci USA 104(27):11,150–11,154CrossRefGoogle Scholar
  10. 10.
    Casas-Roma J (2014) Privacy-preserving and data utility in graph mining. Ph.D. thesis. Universitat Autònoma de BarcelonaGoogle Scholar
  11. 11.
    Casas-Roma J, Herrera-Joancomartí J, Torra, V (2013) An algorithm for k-degree anonymity on large networks. In: Proceedings of the IEEE international conference on advances on social networks analysis and mining, pp 671–675Google Scholar
  12. 12.
    Casas-Roma J, Herrera-Joancomartí J, Torra V (2014) Anonymizing graphs: measuring quality for clustering. Knowl Inf Syst 44(3):507–528CrossRefGoogle Scholar
  13. 13.
    Casas-Roma J, Herrera-Joancomartí J, Torra V (2016) \(k\)-degree anonymity and edge selection: improving data utility in large networks. Knowl Inf Syst. doi: 10.1007/s10115-016-0947-7 Google Scholar
  14. 14.
    Casas-Roma J, Rousseau F (2015) Community-preserving generalization of social networks. In: Proceedings of the social media and risk ASONAM 2015 workshopGoogle Scholar
  15. 15.
    Clauset A, Newman MEJ, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70:066,111CrossRefGoogle Scholar
  16. 16.
    Giatsidis C, Thilikos DM, Vazirgiannis M (2011) Evaluating cooperation in communities with the k-core structure. In: Proceedings of the IEEE international conference on advances in social networks analysis and mining, pp 87–93Google Scholar
  17. 17.
    Gleiser PM, Danon L (2003) Community structure in jazz. Adv Complex Syst 6(04):565–573CrossRefGoogle Scholar
  18. 18.
    Goltsev AV, Dorogovtsev SN, Mendes JFF (2006) k-core (bootstrap) percolation on complex networks: critical phenomena and nonlocal effects. Phys Rev E 73(5)Google Scholar
  19. 19.
    Guimerà R, Danon L, Díaz-Guilera A, Giralt F, Arenas A (2003) Self-similar community structure in a network of human interactions. Phys Rev E 68:065103CrossRefGoogle Scholar
  20. 20.
    Hay M, Miklau G, Jensen D, Towsley D, Weis P (2008) Resisting structural re-identification in anonymized social networks. Proc VLDB Endow 1(1):102–114CrossRefGoogle Scholar
  21. 21.
    Hay M, Miklau G, Jensen D, Weis P, Srivastava S (2007) Anonymizing social networks. Technical Report No. 07-19. Computer Science Department, University of Massachusetts Amherst, AmherstGoogle Scholar
  22. 22.
    Lancichinetti A, Fortunato S (2009) Community detection algorithms: a comparative analysis. Phys Rev E 80:056,117CrossRefGoogle Scholar
  23. 23.
    Leskovec J, Adamic LA, Huberman BA (2007) The dynamics of viral marketing. ACM Trans Web 1(1):5:1–5:39CrossRefGoogle Scholar
  24. 24.
    Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: densification and shrinking diameters. ACM Trans Knowl Discov Data 1(1):2:1–2:40CrossRefGoogle Scholar
  25. 25.
    Liu K, Terzi E (2008) Towards identity anonymization on graphs. In: Proceedings of the ACM SIGMOD international conference on management of data, pp 93–106Google Scholar
  26. 26.
    Malle B, Schrittwieser S, Kieseberg P, Holzinger A (2016) Privacy aware machine learning and the right to be forgotten. ERCIM News 107(10):22–23Google Scholar
  27. 27.
    Malliaros FD, Vazirgiannis M (2013) To stay or not to stay: modeling engagement dynamics in social graphs. In: Proceedings of the 22nd ACM international conference on Information and knowledge management, pp 469–478Google Scholar
  28. 28.
    Narayanan A, Shmatikov V (2009) De-anonymizing social networks. In: Proceedings of the 2009 30th IEEE symposium on security and privacy, pp 173–187Google Scholar
  29. 29.
    Pons P, Latapy M (2005) Computing communities in large networks using random walks. In: International symposium on computer and information sciences, vol 3733, pp 284–293Google Scholar
  30. 30.
    Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. PNAS USA 105(4):1118–1123CrossRefGoogle Scholar
  31. 31.
    Seidman SB (1983) Network structure and minimum degree. Soc Netw 5:269–287MathSciNetCrossRefGoogle Scholar
  32. 32.
    Sweeney L (2002) k-anonymity: a model for protecting privacy. Int J Uncertain Fuzziness Knowl Based Syst 10(5):557–570MathSciNetCrossRefzbMATHGoogle Scholar
  33. 33.
    Wu X, Ying X, Liu K, Chen L (2010) A survey of privacy-preservation of graphs and social networks. In: Aggarwal CC, Wang H (eds) Managing and mining graph data, advances in database systems, vol 40, pp 421–453Google Scholar
  34. 34.
    Yahoo! Webscope: Yahoo! Instant Messenger friends connectivity graph, version 1.0 (2003). http://research.yahoo.com/Academic_Relations
  35. 35.
    Yang J, Leskovec J (2012) Defining and evaluating network communities based on ground-truth. In: Proceedings of the ACM workshop on mining data semantics, pp 1–8Google Scholar
  36. 36.
    Yang J, Leskovec J (2015) Defining and evaluating network communities based on ground-truth. Comput Res Repos (CoRR) 42(1):181–213Google Scholar
  37. 37.
    Ying X, Pan K, Wu X, Guo L (2009) Comparisons of randomization and \(k\)-degree anonymization schemes for privacy preserving social network publishing. In: Workshop on social network mining and analysis, pp 10:1–10:10Google Scholar
  38. 38.
    Ying X, Wu X (2008) Randomizing social networks: a spectrum preserving approach. In: Proceedings of the SIAM international conference on data mining, pp 739–750Google Scholar
  39. 39.
    Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452–473CrossRefGoogle Scholar
  40. 40.
    Zhang K, Lo D, Lim EP, Prasetyo PK (2013) Mining indirect antagonistic communities from social interactions. Knowl Inf Syst 35(3):553–583CrossRefGoogle Scholar
  41. 41.
    Zhou B, Pei J (2008) Preserving privacy in social networks against neighborhood attacks. In: Proceedings of the IEEE 24th international conference on data engineering, pp 506–515Google Scholar
  42. 42.
    Zou L, Chen L, Özsu MT (2009) K-automorphism: a general framework for privacy preserving network publication. Proc VLDB Endow 2(1):946–957CrossRefGoogle Scholar

Copyright information

© Springer-Verlag London 2017

Authors and Affiliations

  1. 1.LIXÉcole PolytechniquePalaiseauFrance
  2. 2.Universitat Oberta de Catalunya (UOC)BarcelonaSpain

Personalised recommendations