Why Waldo befriended the dummy? k-Anonymization of social networks with pseudo-nodes

Chester, Sean; Kapron, Bruce M.; Ramesh, Ganesh; Srivastava, Gautam; Thomo, Alex; Venkatesh, S.

doi:10.1007/s13278-012-0084-6

Original Article
Published: 26 September 2012

Volume 3, pages 381–399, (2013)

Social Network Analysis and Mining

Sean Chester¹,
Bruce M. Kapron¹,
Ganesh Ramesh²,
Gautam Srivastava¹,
Alex Thomo¹ &
…
S. Venkatesh¹

490 Accesses
46 Citations

Abstract

In a graph-based representation of a social network, participants can be uniquely identified if an adversary has background structural knowledge about the graph. We focus on degree-based attacks, in which the adversary knows the degrees of particular target vertices, and we aim to protect the anonymity of participants through k-anonymization, which ensures that every participant is equivalent to at least k − 1 other participants with respect to degree. We propose a natural and novel approach: introducing “dummy” participants into the network and linking them to each other and to real participants in order to achieve this anonymity. The advantage of our approach lies in the nature of the results that we derive. We show that if participants have labels associated with them, the problem of anonymizing a subset of participants is NP-complete. On the other hand, in the absence of labels, we give an $\mathcal{O}(nk)$ algorithm to optimally k-anonymize a subset of participants or to near-optimally k-anonymize all real and all dummy participants. For degree-based attacks, such theoretical guarantees are novel.
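
To make the degree-based notion of k-anonymity concrete, the following minimal Python sketch checks whether every vertex (or every vertex in a chosen subset) shares its degree with at least k − 1 other vertices. It is only an illustration of the definition stated in the abstract, not the $\mathcal{O}(nk)$ algorithm of the paper. The function names `degrees` and `is_k_degree_anonymous`, the edge-list representation, and the toy graph are assumptions made for this example, and vertices without any incident edge are assumed not to occur.

```python
from collections import Counter


def degrees(edges):
    """Return a dict mapping each vertex to its degree.

    `edges` is an iterable of (u, v) pairs of an undirected simple graph.
    Assumption: every vertex of interest appears in at least one edge.
    """
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def is_k_degree_anonymous(edges, k, targets=None):
    """Check the degree-based k-anonymity condition.

    A vertex is protected if at least k vertices of the graph (itself
    included) have its degree. If `targets` is None, all vertices are
    checked; otherwise only the given subset is checked.
    """
    deg = degrees(edges)
    frequency = Counter(deg.values())  # how many vertices have each degree
    if targets is None:
        targets = deg.keys()
    return all(frequency[deg[v]] >= k for v in targets)


# Hypothetical toy network: a path a - b - c. Vertex b is the only one
# with degree 2, so an adversary who knows that degree identifies b.
original = [("a", "b"), ("b", "c")]

# Adding a dummy participant d and linking it to a gives a and b degree 2
# and gives c and d degree 1, so every degree is shared by two vertices.
with_dummy = original + [("d", "a")]
```

In this toy case, `is_k_degree_anonymous(original, 2)` fails because of vertex b, while the graph with the dummy vertex satisfies the condition for k = 2. Passing `targets=["a", "c"]` corresponds to anonymizing only a subset of participants.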

Notes

We define a vertex-labelled graph as the four-tuple $(\hbox{V,E},\Upsigma,\ell)$, where V is a vertex set, $\hbox{E}\subseteq \hbox{V}\times\hbox{V}$ is a set of undirected edges, $\Upsigma$ is a set of sensitive labels, and $ \ell:\hbox{V}\mapsto\Upsigma$ is a labelling function that assigns a label to each vertex. We discuss in the paper two types of labels, sensitive and identifying. By $\Upsigma$, we refer to the former, assuming the latter is stripped from the graph.
We mention specific cases in which these questions have been answered in our discussion of related work in Sect. 6. Even in these cases, however, not all three questions have been fully addressed.
Precise formulations of the problem appear in Sect. 2 for unlabelled graphs and in Sect. 5 for labelled graphs.
For simplicity in this section, we regard a graph as a 2-tuple. We note that equivalently, for consistency, we could express an unlabelled graph as $\mathcal{G}=(\hbox{V, E},\Upsigma,\ell)$ where $\exists \sigma\in\Upsigma: \forall v\in\hbox{V}, {\ell}(v)=\sigma$. However, the 2-tuple notation keeps the exposition simpler.
Considering the Enron email corpus on which we experiment in Sect. 4.1, |V| > 65,000, but only 151 vertices correspond to internal email addresses.
http://snap.stanford.edu/data/.
http://www-personal.umich.edu/mejn/netdata/.
http://www.casos.cs.cmu.edu/computational_tools/datasets/external/polblogs/index11.php.
Recall that a walk is any sequence of adjacent edges, including those which revisit edges and/or vertices.
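
The definitions in the notes above can be sketched in code. The following Python fragment models the vertex-labelled graph four-tuple $(\hbox{V,E},\Upsigma,\ell)$, the unlabelled graph as the special case in which every vertex carries the same label, and the test of whether a vertex sequence is a walk. The class name `LabelledGraph`, the helpers `make_unlabelled` and `is_walk`, and the placeholder label `"*"` are assumptions introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelledGraph:
    """A vertex-labelled graph (V, E, Sigma, l) with undirected edges."""
    vertices: frozenset   # V
    edges: frozenset      # E, each edge stored as frozenset({u, v})
    labels: frozenset     # Sigma, the set of sensitive labels
    labelling: dict       # l : V -> Sigma

    def __post_init__(self):
        # Every edge must join vertices of V.
        for edge in self.edges:
            if not edge <= self.vertices:
                raise ValueError(f"edge {set(edge)} leaves the vertex set")
        # The labelling must assign a label from Sigma to every vertex.
        for v in self.vertices:
            if v not in self.labelling or self.labelling[v] not in self.labels:
                raise ValueError(f"vertex {v!r} has no valid label")


def make_unlabelled(vertices, edge_pairs, sigma="*"):
    """Express an unlabelled graph as a labelled one with a single label."""
    vs = frozenset(vertices)
    es = frozenset(frozenset(pair) for pair in edge_pairs)
    return LabelledGraph(vs, es, frozenset({sigma}), {v: sigma for v in vs})


def is_walk(graph, vertex_sequence):
    """Return True if consecutive vertices are always joined by an edge.

    Vertices and edges may repeat, as the definition of a walk allows.
    """
    steps = zip(vertex_sequence, vertex_sequence[1:])
    return all(frozenset((a, b)) in graph.edges for a, b in steps)


# Hypothetical example: the path a - b - c as an unlabelled graph.
g = make_unlabelled("abc", [("a", "b"), ("b", "c")])
```

Here the sequence a, b, a, b, c is a walk in `g` even though it revisits the edge between a and b, whereas a, c is not a walk because those two vertices are not adjacent.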

References

Adamic L, Glance N (2005) The political blogosphere and the 2004 U.S. election: divided they blog. In: Proceedings of WWW 2005 workshop on the weblogging ecosystem
Aggarwal G, Feder T, Kenthapadi K, Motwani R, Panigrahy R, Thomas D, Zhu A (2005) Anonymizing tables. In: Proceedings of international conference on database theory (ICDT), pp 246–258
Akiyama J, Era H, Harary F (1983) Regular graphs containing a given graph. Am Math Month 83:15–17
Backstrom L, Dwork C, Kleinberg JM (2007) Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography. In: Proceedings of conference on world wide web (WWW), pp 181–190
Barrat A, Weigt M (2000) On the properties of small-world network models. Eur Phys J B 13(3):547–560
Bodlaender HL, Tan RB, van Leeuwen J (2000) Finding a delta-regular supergraph of minimum order. Tech Rep UU-CS-2000-29, Dept of Computer Science, Utrecht University, Utrecht
Chakrabarti D, Faloutsos C (2006) Graph mining: laws, generators, and algorithms. ACM Comput Surv 38(1):2. doi:10.1145/1132952.1132954
Cheng J, Fu AWC, Liu J (2010) K-isomorphism: privacy preserving network publication against structural attacks. In: Proceedings of ACM Special Interest Group on Management of Data (SIGMOD), pp 459–470
Chester S, Srivastava G (2011) Social network privacy for attribute disclosure attacks. In: Proceedings of advances in social networks analysis and mining (ASONAM)
Chester S, Kapron B, Ramesh G, Srivastava G, Thomo A, Venkatesh S (2011) k-anonymization of social networks by vertex addition. In: Proceedings of advances in databases and information systems (ADBIS)
Chester S, Gaertner J, Stege U, Venkatesh S (2012a) Anonymizing subsets of social networks with degree constrained subgraphs. In: Proceedings of advances in social networks analysis and mining (ASONAM)
Chester S, Kapron B, Srivastava G, Venkatesh S (2012b) Complexity of social network anonymization. Soc Netw Anal Min. doi:10.1007/s13278-012-0059-7
Costa LdF, Rodrigues FA, Travieso G, Villas Boas PR (2007) Characterization of complex networks: a survey of measurements. Adv Phys 56:167–242
Domingo-Ferrer J (ed) (2002) Inference Control in statistical databases, from theory to practice. In: Lecture Notes in Computer Science, vol 2316. Springer, Berlin
Dwork C (2006) Differential privacy. In: ICALP. Springer, Berlin, pp 1–12
Erdős P, Kelly P (1967) The minimal regular graph containing a given graph. Am Math Month 70:1074–1075
Estrada E, Rodriguez-Velazquez JA (2005) Spectral measures of bipartivity in complex networks. Phys Rev E 72(4):046105. doi:10.1103/PhysRevE.72.046105
Faloutsos M, Faloutsos P, Faloutsos C (1999) On power-law relationships of the internet topology. SIGCOMM Comput Commun Rev 29(4):251–262. doi:10.1145/316194.316229
Ferri F, Grifoni P, Guzzo T (2012) New forms of social and professional digital relationships: the case of facebook. Soc Netw Anal Min 2(2):121–137
Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci USA 99:7821–7826
González JJS (2002) Extending cell suppression to protect tabular data against several attackers. In: Inference Control in Statistical Databases, pp 34–58
Hay M, Miklau G, Jensen D, Towsley DF, Weis P (2008) Resisting structural re-identification in anonymized social networks. Proc Very Large Datab 1(1):102–114
Heer J (2005) Prefuse: a toolkit for interactive information visualization. In: CHI 05: Proceedings of the SIGCHI conference on human factors in computing systems. ACM Press, New York, pp 421–430
König D (1936) Theorie der endlichen und unendlichen Graphen. Akademische Verlagsgesellschaft, Leipzig
Latora V, Marchiori M (2001) Efficient behavior of small-world networks. Phys Rev Lett 87. doi:10.1103/PhysRevLett.87.198701
Leskovec J, Kleinberg J, Faloutsos C (2005) Graphs over time: Densification laws, shrinking diameters and possible explanations. In: Proceedings of international conference on knowledge discovery and data mining (KDD)
Leskovec J, Lang KJ, Dasgupta A, Mahoney MW (2008) Statistical properties of community structure in large social and information networks. In: Proceedings of conference on world wide web (WWW), pp 695–704
Li N, Li T, Venkatasubramanian S (2007) t-closeness: privacy beyond k-anonymity and l-diversity. In: Proceedings of IEEE 23rd international conference on data engineering (ICDE07)
Liu K, Terzi E (2008) Towards identity anonymization on graphs. In: Proceedings of ACM Special Interest Group on Management of Data (SIGMOD), pp 93–106
Machanavajjhala A, Kifer D, Gehrke J, Venkitasubramaniam M (2007) L-diversity: Privacy beyond k-anonymity. ACM Trans. Knowl. Discov. Data 1(1). doi:10.1145/1217299.1217302
McSherry F, Mironov I (2009) Differentially private recommender systems: building privacy into the netflix prize contenders. In: Proceedings of international conference on knowledge discovery and data mining (KDD), pp 627–636
Meyerson A, Williams R (2004) On the complexity of optimal k-anonymity. In: Principles of database systems, pp 223–228
Milgram S (1967) The small world problem. Psychol Today 2:60–67
Newman MEJ (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74(3). doi:10.1103/PhysRevE.74.036104
Robertson DA, Ethier R (2002) Cell suppression: experience and theory. In: Inference control in statistical databases, pp 8–20
Sweeney L (2002) k-anonymity: A model for protecting privacy. Int J Uncertainty Fuzziness Knowl Based Syst 10(5):557–570
Thompson B, Yao D (2009) The union-split algorithm and cluster-based anonymization of social networks. In: Proceedings of ACM symposium on information, computer and communications security (ASIACCS), pp 218–227
Wang Y, Xie L, Zheng B, Lee KCK (2011) Utility-oriented k-anonymization on social networks. In: Proceedings of the 16th international conference on Database systems for advanced applications, vol Part I, DASFAA’11. Springer, Berlin, pp 78–92
Wu W, Xiao Y, Wang W, He Z, Wang Z (2010) k-symmetry model for identity anonymization in social networks. In: Proceedings of international conference on extending database technology (EDBT), pp 111–122
Ying X, Pan K, Wu X, Guo L (2009) Comparisons of randomization and k-degree anonymization schemes for privacy preserving social network publishing. In: Proceedings of 3rd workshop on social network mining and analysis (SNA-KDD). ACM, New York, pp 10:1–10:10
Yuan M, Chen L, Yu PS (2010) Personalized privacy protection in social networks. Proc Very Large Datab 4(2):141–150
Zheleva E, Getoor L (2007) Preserving the privacy of sensitive relationships in graph data. In: Proceedings of privacy, security, and trust in KDD (PinKDD), pp 153–171
Zhou B, Pei J (2011) The k-anonymity and l-diversity approaches for privacy preservation in social networks against neighborhood attacks. Knowledge Information Systems 28(1):47–77

Author information

Authors and Affiliations

Department of Computer Science, University of Victoria, PO Box 3055, STN CSC, Victoria, BC, V8W 3P6, Canada
Sean Chester, Bruce M. Kapron, Gautam Srivastava, Alex Thomo & S. Venkatesh
Yahoo! Inc., 4401 Great America Parkway, Santa Clara, CA, 95054, USA
Ganesh Ramesh

Corresponding author

Correspondence to Sean Chester.

Additional information

A preliminary, short version (Chester et al. 2011) of this paper appeared at ADBIS 2011.

About this article

Cite this article

Chester, S., Kapron, B.M., Ramesh, G. et al. Why Waldo befriended the dummy? k-Anonymization of social networks with pseudo-nodes. Soc. Netw. Anal. Min. 3, 381–399 (2013). https://doi.org/10.1007/s13278-012-0084-6

Received: 05 December 2011
Revised: 28 June 2012
Accepted: 04 September 2012
Published: 26 September 2012
Issue Date: September 2013
DOI: https://doi.org/10.1007/s13278-012-0084-6
