The k-anonymity and l-diversity approaches for privacy preservation in social networks against neighborhood attacks

Zhou, Bin; Pei, Jian

doi:10.1007/s10115-010-0311-2

Regular Paper
Published: 16 June 2010

Volume 28, pages 47–77 (2011)
Knowledge and Information Systems

Abstract

Social network data are increasingly being published in one form or another, which makes preserving privacy in published social network data an important concern. With some local knowledge about individuals in a social network, an adversary may easily attack the privacy of some victims. Unfortunately, most previous studies on privacy-preserving data publishing deal with relational data only and cannot be applied to social network data. In this paper, we take an initiative toward preserving privacy in social network data. Specifically, we identify an essential type of privacy attack: the neighborhood attack. If an adversary has some knowledge about the neighbors of a target victim and the relationships among those neighbors, the victim may be re-identified from a social network even if the victim’s identity is protected by conventional anonymization techniques. To protect privacy against neighborhood attacks, we extend the conventional k-anonymity and l-diversity models from relational data to social network data. We show that the problems of computing optimal k-anonymous and l-diverse social networks are NP-hard, and we develop practical solutions to them. The empirical study indicates that social network data anonymized by our methods can still be used to answer aggregate network queries with high accuracy.
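
To make the neighborhood attack concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes that a vertex's neighborhood is the subgraph induced by its direct neighbors, that vertices carry no labels, and that two vertices are indistinguishable to the adversary when their neighborhoods are isomorphic. The function names (`build_adj`, `neighborhood`, `canonical_form`, `neighborhood_classes`, `is_k_anonymous`) and the toy graph are hypothetical, and the brute-force isomorphism test is only practical for very small neighborhoods.

```python
from collections import defaultdict
from itertools import permutations


def build_adj(edge_list):
    """Build an undirected adjacency map {vertex: set of neighbors}."""
    adj = defaultdict(set)
    for a, b in edge_list:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def neighborhood(adj, v):
    """Return v's neighbors and the edges among them (v itself excluded)."""
    nbrs = sorted(adj[v])
    edges = {(a, b) for a in nbrs for b in adj[a] if b in adj[v] and a < b}
    return nbrs, edges


def canonical_form(nbrs, edges):
    """Brute-force canonical label: the smallest relabelled edge list over
    all orderings of the neighbors. Exponential, so toy-sized inputs only."""
    best = None
    for perm in permutations(range(len(nbrs))):
        relabel = dict(zip(nbrs, perm))
        key = tuple(sorted(tuple(sorted((relabel[a], relabel[b])))
                           for a, b in edges))
        if best is None or key < best:
            best = key
    return (len(nbrs), best)


def neighborhood_classes(adj):
    """Group vertices whose neighborhoods are isomorphic."""
    groups = defaultdict(list)
    for v in adj:
        groups[canonical_form(*neighborhood(adj, v))].append(v)
    return list(groups.values())


def is_k_anonymous(adj, k):
    """True if every vertex shares its neighborhood shape with at least
    k - 1 other vertices."""
    return all(len(group) >= k for group in neighborhood_classes(adj))


# Toy network: A has three friends, exactly two of whom know each other.
adj = build_adj([("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("D", "E")])
for group in neighborhood_classes(adj):
    print(sorted(group))
print(is_k_anonymous(adj, 2))
```

In this toy graph only B and C have matching neighborhoods, so an adversary who knows that the target has three friends, two of whom are friends with each other, can single out A even after names are removed. The sketch only checks the anonymity condition; it does not modify the graph to achieve it.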

References

Adamic L, Adar E (2005) How to search a social network. Soc Netw 27(3): 187–203
Backstrom L, Huttenlocher D, Kleinberg J, Lan X (2006) Group formation in large social networks: membership, growth, and evolution. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’06), ACM Press, New York, pp 44–54
Backstrom L, Dwork C, Kleinberg J (2007) Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography. In: Proceedings of the 16th international conference on World Wide Web (WWW’07), ACM Press, New York, pp 181–190
Bhagat S, Cormode G, Krishnamurthy B, Srivastava D (2009) Class-based graph anonymization for social network data. PVLDB 2(1): 766–777
Campan A, Truta TM (2008) A clustering approach for data and structural anonymity in social networks. In: Proceedings of the 2nd ACM SIGKDD international workshop on privacy, security, and trust in KDD (PinKDD’08), in conjunction with KDD’08, Las Vegas, Nevada
Chakrabarti D, Zhan Y, Faloutsos C (2004) R-mat: a recursive model for graph mining. In: Proceedings of the 2004 SIAM international conference on data mining (SDM’04), SIAM, Philadelphia
Cormen TH, Leiserson CE, Rivest RL, Stein C (2002) Introduction to algorithms, 2nd edn. MIT Press and McGraw-Hill, Cambridge
Cormode G, Srivastava D, Yu T, Zhang Q (2008) Anonymizing bipartite graph data using safe groupings. PVLDB 1(1): 833–844
Coull SE, Monrose F, Reiter MK, Bailey M (2009) The challenges of effectively anonymizing network data. In: Proceedings of the 2009 cybersecurity applications & technology conference for homeland security (CATCH’09), IEEE Computer Society, Washington, DC, pp 230–236
Dwork C (2008) Differential privacy: a survey of results. In: Proceedings of the 5th international conference on theory and applications of models of computation. Lecture notes in computer science, vol 4978. Springer, pp 1–19
Dwork C, Smith A (2008) Differential privacy for statistics: what we know and what we want to learn. In: Proceedings of NCHS/CDC data confidentiality workshop
Faloutsos M, Faloutsos P, Faloutsos C (1999) On power law relationships of the internet topology. In: Proceedings of the conference on applications, technologies, architectures, and protocols for computer communication (SIGCOMM’99), ACM Press, New York, pp 251–262
Garey MR, Johnson DS (1979) Computers and intractability: a guide to the theory of NP-completeness. W. H. Freeman & Co., New York
Getoor L, Diehl CP (2005) Link mining: a survey. ACM SIGKDD Explor Newsl 7(2): 3–12
Gkoulalas-Divanis A, Verykios VS (2009) Hiding sensitive knowledge without side effects. Knowl Inf Syst 20(3): 263–299
Hay M, Miklau G, Jensen D, Weis P, Srivastava S (2007) Anonymizing social networks. Tech. Rep. 07-19, University of Massachusetts Amherst
Hay M, Miklau G, Jensen D, Towsley D (2008) Resisting structural identification in anonymized social networks. PVLDB 1(1): 102–114
Hay M, Li C, Miklau G, Jensen D (2009) Accurate estimation of the degree distribution of private networks. In: Proceedings of the 2009 ninth IEEE international conference on data mining (ICDM’09), IEEE Computer Society, Washington, DC, pp 169–178
Hazan E, Safra S, Schwartz O (2003) On the complexity of approximating k-dimensional matching. In: Proceedings of the 6th international workshop on approximation algorithms for combinatorial optimization problems and of the 7th international workshop on randomization and computation techniques in computer science (RANDOM-APPROX’03), LNCS, vol 2764. Springer, Berlin, pp 83–97
Korolova A, Motwani R, Nabar SU, Xu Y (2008) Link privacy in social networks. In: Proceedings of the 24th international conference on data engineering (ICDE’08), IEEE, pp 1355–1357
Kossinets G, Watts DJ (2006) Empirical analysis of an evolving social network. Science 311(5757): 88–90
Kumar R, Novak J, Tomkins A (2006) Structure and evolution of online social networks. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD’06), ACM Press, New York, pp 611–617
Li N, Li T, Venkatasubramanian S (2007) t-Closeness: privacy beyond k-anonymity and l-diversity. In: Proceedings of the 23rd international conference on data engineering (ICDE’07), IEEE, pp 106–115
Liu K, Terzi E (2008) Towards identity anonymization on graphs. In: Proceedings of the 2008 ACM SIGMOD international conference on management of data (SIGMOD’08), ACM Press, New York, pp 93–106
Liu K, Das K, Grandison T, Kargupta H (2008) Privacy-preserving data analysis on graphs and social networks. In: Kargupta H, Han J, Yu P, Motwani R, Kumar V (eds) Next generation data mining. CRC Press, Boca Raton
Luo H, Fan J, Lin X, Zhou A, Bertino E (2009) A distributed approach to enabling privacy-preserving model-based classifier training. Knowl Inf Syst 20(2): 157–185
Machanavajjhala A, Gehrke J, Kifer D, Venkitasubramaniam M (2006) L-diversity: privacy beyond k-anonymity. In: Proceedings of the 22nd IEEE international conference on data engineering (ICDE’06), IEEE Computer Society, Washington, DC
Machanavajjhala A, Kifer D, Abowd JM, Gehrke J, Vilhuber L (2008) Privacy: theory meets practice on the map. In: Proceedings of the 24th international conference on data engineering (ICDE’08), pp 277–286
Meyerson A, Williams R (2004) On the complexity of optimal k-anonymity. In: Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems (PODS’04), ACM, New York, pp 223–228
Muhlestein D, Lim S (2010) Online learning with social computing based interest sharing. Knowl Inf Syst. doi:10.1007/s10115-009-0265-4
Qiu L, Li Y, Wu X (2008) Protecting business intelligence and customer privacy while outsourcing data mining tasks. Knowl Inf Syst 17(2): 99–120
Rastogi V, Hay M, Miklau G, Suciu D (2009) Relationship privacy: output perturbation for queries with joins. In: Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems (PODS’09), ACM, New York, pp 107–116
Samarati P (2001) Protecting respondents’ identities in microdata release. IEEE Trans Knowl Data Eng (TKDE) 13(6): 1010–1027
Samarati P, Sweeney L (1998) Generalizing data to provide anonymity when disclosing information. In: Proceedings of the 7th ACM SIGACT-SIGMOD-SIGART symposium on principles of database systems (PODS’98), ACM Press, New York, p 188
Sweeney L (2002) K-anonymity: a model for protecting privacy. Int J Uncertain Fuzziness Knowl Based Syst 10(5): 557–570
Wang DW, Liau CJ, Hsu TS (2006) Privacy protection in social network data disclosure based on granular computing. In: Proceedings of the 2006 IEEE international conference on fuzzy systems, Vancouver, BC, pp 997–1003
Wasserman S, Faust K (1994) Social network analysis: methods and applications. Cambridge University Press, New York
Xiao X, Tao Y (2006a) Anatomy: simple and effective privacy preservation. In: Dayal U, Whang KY, Lomet DB, Alonso G, Lohman GM, Kersten ML, Cha SK, Kim YK (eds) Proceedings of the 32nd international conference on very large data bases (VLDB’06), ACM, pp 139–150
Xiao X, Tao Y (2006b) Personalized privacy preservation. In: Proceedings of the 2006 ACM SIGMOD international conference on Management of data (SIGMOD’06), ACM Press, New York, pp 229–240
Xiao X, Tao Y (2007) M-invariance: towards privacy preserving re-publication of dynamic datasets. In: Proceedings of the 2007 ACM SIGMOD international conference on management of data (SIGMOD’07), ACM, New York, pp 689–700
Xiao X, Tao Y (2008) Output perturbation with query relaxation. PVLDB 1(1): 857–869
Xu J, Wang W, Pei J, Wang X, Shi B, Fu AWC (2006) Utility-based anonymization using local recoding. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’06), ACM Press, New York, pp 785–790
Yan X, Han J (2002) gSpan: graph-based substructure pattern mining. In: Proceedings of the 2002 IEEE international conference on data mining (ICDM’02), IEEE Computer Society, Washington, DC, p 721
Yan X, Yu PS, Han J (2004) Graph indexing: a frequent structure-based approach. In: Proceedings of the 2004 ACM SIGMOD international conference on management of data (SIGMOD’04), ACM Press, New York, pp 335–346
Ying X, Wu X (2008) Randomizing social networks: a spectrum preserving approach. In: Proceedings of the 2008 SIAM international conference on data mining (SDM’08), SIAM, pp 739–750
Ying X, Wu X (2009a) On link privacy in randomizing social networks. In: Proceedings of the 13th Pacific-Asia conference on advances in knowledge discovery and data mining, Springer, pp 28–39
Ying X, Wu X (2009b) On randomness measures for social networks. In: Proceedings of the 2009 SIAM international conference on data mining, SIAM, pp 709–720
Zheleva E, Getoor L (2007) Preserving the privacy of sensitive relationships in graph data. In: Proceedings of the 1st ACM SIGKDD workshop on privacy, security, and trust in KDD (PinKDD’07)
Zhou B, Pei J (2008) Preserving privacy in social networks against neighborhood attacks. In: Proceedings of the 24th IEEE international conference on data engineering (ICDE’08), IEEE Computer Society, Cancun, pp 506–515
Zhou B, Pei J, Luk WS (2008) A brief survey on anonymization techniques for privacy preserving publishing of social network data. SIGKDD Explor 10(2): 12–22
Zou L, Chen L, Özsu MT (2009) K-automorphism: a general framework for privacy preserving network publication. PVLDB 2(1): 946–957

Author information

Authors and Affiliations

School of Computing Science, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
Bin Zhou & Jian Pei

Corresponding author

Correspondence to Bin Zhou.

Additional information

A preliminary version of this paper appears as Zhou and Pei [49]. This research is supported in part by an NSERC Discovery Grant and an NSERC Discovery Accelerator Supplement Grant. All opinions, findings, conclusions and recommendations in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.

About this article

Cite this article

Zhou, B., Pei, J. The k-anonymity and l-diversity approaches for privacy preservation in social networks against neighborhood attacks. Knowl Inf Syst 28, 47–77 (2011). https://doi.org/10.1007/s10115-010-0311-2

Received: 16 November 2009
Revised: 28 April 2010
Accepted: 31 May 2010
Published: 16 June 2010
Issue Date: July 2011
DOI: https://doi.org/10.1007/s10115-010-0311-2
