Social Network Analysis and Mining, Volume 3, Issue 3, pp 381–399

Why Waldo befriended the dummy? k-Anonymization of social networks with pseudo-nodes

  • Sean Chester
  • Bruce M. Kapron
  • Ganesh Ramesh
  • Gautam Srivastava
  • Alex Thomo
  • S. Venkatesh
Original Article


In a graph-based representation of a social network, participants can be uniquely identified if an adversary has background structural knowledge about the graph. We focus on degree-based attacks, in which the adversary knows the degrees of particular target vertices. We aim to protect the anonymity of participants through k-anonymization, which ensures that every participant is equivalent to at least k − 1 other participants with respect to degree. We propose a natural and novel approach: adding “dummy” participants to the network and linking them to each other and to real participants in order to achieve this anonymity. The advantage of our approach lies in the nature of the results that we derive. We show that if participants have labels associated with them, the problem of anonymizing a subset of participants is NP-complete. On the other hand, in the absence of labels, we give an \(\mathcal{O}(nk)\) algorithm to optimally k-anonymize a subset of participants or to near-optimally k-anonymize all real and all dummy participants. For degree-based attacks, such theoretical guarantees are novel.
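To make the anonymity condition concrete, the following is a minimal Python sketch of a checker for degree-based k-anonymity. It is not the paper's \(\mathcal{O}(nk)\) algorithm; it only tests whether a given graph already satisfies the condition. The function names `degrees` and `is_k_degree_anonymous`, the edge-list representation, and the toy graph are all assumptions made for illustration.

```python
from collections import Counter


def degrees(edges, vertices=()):
    """Map each vertex of an undirected simple graph to its degree.

    `vertices` optionally lists isolated vertices, which never appear
    in the edge list and would otherwise be missed.
    """
    deg = Counter({v: 0 for v in vertices})
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def is_k_degree_anonymous(deg, k):
    """True if every vertex shares its degree with at least k - 1 others."""
    counts = Counter(deg.values())  # how many vertices have each degree
    return all(counts[d] >= k for d in deg.values())


# Hypothetical toy network: the path a - b - c. Vertex b is the only
# vertex of degree 2, so a degree-based attack identifies it.
real_edges = [("a", "b"), ("b", "c")]
before = is_k_degree_anonymous(degrees(real_edges), 2)

# Add one dummy participant d and link it to a. Now a and b have
# degree 2, and c and d have degree 1.
with_dummy = real_edges + [("d", "a")]
after = is_k_degree_anonymous(degrees(with_dummy), 2)
```

In this toy case a single dummy vertex and one new edge are enough to make every degree occur at least twice, without removing or rewiring any edge between real participants.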


Privacy · k-Anonymization · Social networks · Complexity · Dynamic programming



Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  • Sean Chester (1)
  • Bruce M. Kapron (1)
  • Ganesh Ramesh (2)
  • Gautam Srivastava (1)
  • Alex Thomo (1)
  • S. Venkatesh (1)
  1. Department of Computer Science, University of Victoria, Victoria, Canada
  2. Yahoo! Inc., Santa Clara, USA
