k-Degree anonymity and edge selection: improving data utility in large networks

Casas-Roma, Jordi; Herrera-Joancomartí, Jordi; Torra, Vicenç

doi:10.1007/s10115-016-0947-7

k-Degree anonymity and edge selection: improving data utility in large networks

Regular Paper
Published: 27 April 2016

Volume 50, pages 447–474, (2017)
Cite this article

Knowledge and Information Systems Aims and scope Submit manuscript

Jordi Casas-Roma ORCID: orcid.org/0000-0002-0617-3303¹,
Jordi Herrera-Joancomartí² &
Vicenç Torra³

1272 Accesses
50 Citations
2 Altmetric
Explore all metrics

Abstract

The problem of anonymization in large networks and the utility of released data are considered in this paper. Although there are some anonymization methods for networks, most of them cannot be applied in large networks because of their complexity. In this paper, we devise a simple and efficient algorithm for k-degree anonymity in large networks. Our algorithm constructs a k-degree anonymous network by the minimum number of edge modifications. We compare our algorithm with other well-known k-degree anonymous algorithms and demonstrate that information loss in real networks is lowered. Moreover, we consider the edge relevance in order to improve the data utility on anonymized networks. By considering the neighbourhood centrality score of each edge, we preserve the most important edges of the network, reducing the information loss and increasing the data utility. An evaluation of clustering processes is performed on our algorithm, proving that edge neighbourhood centrality increases data utility. Lastly, we apply our algorithm to different large real datasets and demonstrate their efficiency and practical utility.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

A Summary of $$k$$ -Degree Anonymous Methods for Privacy-Preserving on Networks

k-Degree anonymity on directed networks

Article 01 September 2018

The effect of distant connections on node anonymity in complex networks

Article Open access 12 January 2024

Notes

A preliminary, short version [9] of this paper appeared at ASONAM 2013.
Java implementation and source code available at: https://jcasasr.wordpress.com.

References

Adamic LA, Glance N (2005) The political blogosphere and the 2004 US election. In: International workshop on link discovery. ACM, New York, NY, USA, pp 36–43
Backstrom L, Dwork C, Kleinberg J (2007) Wherefore art thou r3579x? Anonymized social networks, hidden patterns, and structural steganography. In: International conference on world wide web. ACM, New York, NY, USA, pp 181–190
Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10):P10008
Article Google Scholar
Boldi P, Bonchi F, Gionis A, Tassa T (2012) Injecting uncertainty in graphs for identity obfuscation. Proc VLDB Endow 5(11):1376–1387
Article Google Scholar
Bredereck R, Froese V, Hartung S, Nichterlein A, Niedermeier R, Talmon N (2014) The complexity of degree anonymization by vertex addition. In: Proceedings of the 10th international conference on algorithmic aspects in information and management. Springer, Vancouver, BC, Canada, pp 44–55
Cai B-J, Wang H-Y, Zheng H-R, Wang H (2010) Evaluation repeated random walks in community detection of social networks. In: International conference on machine learning and cybernetics. IEEE, Qingdao, China, pp 1849–1854
Campan A, Alufaisan Y, Truta TM (2015) Preserving communities in anonymized social networks. Trans Data Priv 8(1):55–87
Google Scholar
Campan A, Truta TM (2009) Data and structural $k$-anonymity in social networks. In: Bonchi F, Ferrari E, Jiang W, Malin B (eds) Privacy, security, and trust in KDD. Springer, New York, pp 33–54
Casas-Roma J, Herrera-Joancomartí J, Torra V (2013) An algorithm for $k$-degree anonymity on large networks. In: IEEE international conference on advances on social networks analysis and mining. IEEE, Niagara Falls, CA, USA, pp 671–675
Casas-Roma J, Herrera-Joancomartí J, Torra V (2013) Analyzing the impact of edge modifications on networks. In: International conference on modeling decisions for artificial intelligence. Springer, Barcelona, Spain, pp 296–307
Cheng J, Fu AW, Liu J (2010) $k$-isomorphism: privacy preserving network publication against structural attacks. In: International conference on management of data. ACM, New York, NY, USA, pp 459–470
Chester S, Kapron BM, Ramesh G, Srivastava G, Thomo A, Venkatesh S (2011) $k$-anonymization of social networks by vertex addition. In: ADBIS 2011 research communications, pp 107–116
Chester S, Gaertner J, Stege U, Venkatesh S (2012) Anonymizing subsets of social networks with degree constrained subgraphs. In: IEEE International conference on advances on social networks analysis and mining. IEEE, Washington, DC, USA, pp 418–422
Chester S, Kapron BM, Ramesh G, Srivastava G, Thomo A, Venkatesh S (2013) Why Waldo befriended the dummy? $k$-anonymization of social networks with pseudo-nodes. Soc Netw Anal Min 3(3):381–399
Article Google Scholar
Clauset A, Newman MEJ, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70(6):1–6
Article Google Scholar
De Capitani di Vimercati S, Foresti S, Livraga G, Samarati P (2012) Data privacy: definitions and techniques. Int J Uncertain Fuzziness Knowl Based Syst 20(6):793–818
Article MATH Google Scholar
Dwork C (2006) Differential privacy. In: International conference on automata, languages and programming, vol 4052, pp 1–12
Dwork C (2011) A firm foundation for private data analysis. Commun ACM 54(1):86–95
Article Google Scholar
Ferri F, Grifoni P, Guzzo T (2011) New forms of social and professional digital relationships: the case of Facebook. Soc Netw Anal Min 2(2):121–137
Article Google Scholar
Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci USA 99(12):7821–7826
Article MathSciNet MATH Google Scholar
Hansen SL, Mukherjee S (2003) A polynomial algorithm for optimal univariate microaggregation. IEEE Trans Knowl Data Eng 15(4):1043–1044
Article Google Scholar
Hartung S, Hoffmann C, Nichterlein A (2014) Improved upper and lower bound heuristics for degree anonymization in social networks. In: Proceedings of the 13th international symposium on experimental algorithms. Springer, Copenhagen, Denmark, pp 376–387
Hay M, Miklau G, Jensen D, Weis P, Srivastava S (2007) Anonymizing social networks. Technical Report No. 07-19, Computer Science Department, University of Massachusetts, Amherst
Hay M, Miklau G, Jensen D, Towsley D, Weis P (2008) Resisting structural re-identification in anonymized social networks. Proc VLDB Endow 1(1):102–114
Article Google Scholar
Hay M, Li C, Miklau G, Jensen D (2009) Accurate estimation of the degree distribution of private networks. In: IEEE International conference on data mining (ICDM). IEEE, Miami, FL, USA, pp 169–178
Hay M, Liu K, Miklau G, Pei J, Terzi E (2011) Privacy-aware data management in information networks. In: International conference on management of data. ACM Press, New York, NY, USA, pp 1201–1204
Krebs V (2006). US politics book. Retrieved from http://www.orgnet.com
Lancichinetti A, Fortunato S (2009) Community detection algorithms: a comparative analysis. Phys Rev E 80(5):056117
Article Google Scholar
Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: densification and shrinking diameters. ACM Trans Knowl Discov Data 1(1):1–40
Article Google Scholar
Leskovec J, Adamic LA, Huberman BA (2007) The dynamics of viral marketing. ACM Trans Web 1(1):5:1–5:46
Article Google Scholar
Leskovec J, Lang K, Dasgupta A, Mahoney M (2009) Community structure in large networks: natural cluster sizes and the absence of large well-defined clusters. Internet Math 6(1):29–123
Article MathSciNet MATH Google Scholar
Li N, Li T, Venkatasubramanian S (2007) $t$-closeness: privacy beyond $k$-anonymity and $\ell $-diversity. In: IEEE international conference on data engineering. IEEE, pp 106–115
Liu K, Terzi E (2008) Towards identity anonymization on graphs. In ACM SIGMOD international conference on management of data. ACM, New York, NY, USA, pp 93–106
Lu X, Song Y, Bressan S (2012) Fast identity anonymization on graphs. In: Proceedings of the 23rd international conference on database and expert systems applications. Springer, Vienna, Austria, pp 281–295
Nagle F (2013) Privacy breach analysis in social networks. In: Özyer T, Erdem Z, Rokne J, Khoury S (eds) Mining social networks and security informatics. Springer, Dordrecht, pp 63–77
Chapter Google Scholar
Nguyen HH, Imine A, Rusinowitch M (2015) Anonymizing social graphs via uncertainty semantics. In: Proceedings of the 10th ACM symposium on information, computer and communications security. ACM, Singapore, pp 495–506
Machanavajjhala A, Kifer D, Gehrke J, Venkitasubramaniam M (2007) $\ell $-diversity: privacy beyond k$k$-anonymity. ACM Trans Knowl Discov Data 1(1):3:1–3:12
Article Google Scholar
McSherry F, Mironov I (2009) Differentially private recommender systems. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York, NY, USA, pp 627–636
Pons P, Latapy M (2005) Computing communities in large networks using random walks. In: 20th international symposium computer and information sciences. Springer, Istanbul, Turkey, pp 284–293
Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci USA 105(4):1118–1123
Article Google Scholar
Samarati P (2001) Protecting respondents’ identities in microdata release. IEEE Trans Knowl Data Eng 13(6):1010–1027
Article Google Scholar
Sweeney L (2002) $k$-anonymity: a model for protecting privacy. Int J Uncertain Fuzziness Knowl Based Syst 10(5):557–570
Article MathSciNet MATH Google Scholar
Tripathy BK, Panda GK (2010) A new approach to manage security against neighborhood attacks in social networks. In: IEEE international conference on advances on social networks analysis and mining. IEEE, pp 264–269
Wu W, Xiao Y, Wang W, He Z, Wang Z (2010) $k$-symmetry model for identity anonymization in social networks. In: International conference on extending database technology. ACM, New York, NY, USA, pp 111–122
Yahoo! Webscope, Yahoo! Instant Messenger friends connectivity graph, version 1.0. http://research.yahoo.com/Academic_Relations
Ying X, Wu X (2008) Randomizing social networks: a spectrum preserving approach. In: SIAM Conference on data mining. SIAM, Atlanta, GA, USA, pp 739–750
Ying X, Pan K, Wu X, Guo L (2009) Comparisons of randomization and $k$-degree anonymization schemes for privacy preserving social network publishing. In: Workshop on social network mining and analysis. ACM, New York, NY, USA, pp 10:1–10:10
Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452–473
Article Google Scholar
Zhang K, Lo D, Lim E-P, Prasetyo PK (2013) Mining indirect antagonistic communities from social interactions. Knowl Inf Syst (KAIS) 35(3):553–583
Article Google Scholar
Zheleva E, Getoor L (2011) Privacy in social networks: a survey. Social network data analytics. Springer, Berlin
Google Scholar
Zhou B, Pei J (2008) Preserving privacy in social networks against neighborhood attacks. In: IEEE International conference on data engineering (ICDE). IEEE, Washington, DC, USA, pp 506–515
Zhou B, Pei J (2011) The $k$-anonymity and $\ell $-diversity approaches for privacy preservation in social networks against neighborhood attacks. Knowl Inf Syst 28(1):47–77
Article Google Scholar
Zhou B, Pei J, Luk W (2008) A brief survey on anonymization techniques for privacy preserving publishing of social network data. ACM SIGKDD Explor Newsl 10(2):12–22
Article Google Scholar
Zou L, Chen L, Özsu MT (2009) $k$-automorphism: a general framework for privacy preserving network publication. Proc VLDB Endow 2(1):946–957
Article Google Scholar

Download references

Acknowledgments

This work was partly funded by the Spanish Government through Grants TIN2011-27076-C03-02 “CO-PRIVACY” and TIN2014-57364-C2-2-R “SMARTGLACIS.”

Author information

Authors and Affiliations

Faculty of Computer Science, Multimedia and Telecommunications, Internet Interdisciplinary Institute (IN3), Universitat Oberta de Catalunya (UOC), Barcelona, Spain
Jordi Casas-Roma
Department of Information and Communications Engineering, Universitat Autònoma de Barcelona (UAB), Bellaterra, Spain
Jordi Herrera-Joancomartí
School of Informatics, University of Skövde, Skövde, Sweden
Vicenç Torra

Authors

Jordi Casas-Roma
View author publications
You can also search for this author in PubMed Google Scholar
Jordi Herrera-Joancomartí
View author publications
You can also search for this author in PubMed Google Scholar
Vicenç Torra
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jordi Casas-Roma.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Casas-Roma, J., Herrera-Joancomartí, J. & Torra, V. k-Degree anonymity and edge selection: improving data utility in large networks. Knowl Inf Syst 50, 447–474 (2017). https://doi.org/10.1007/s10115-016-0947-7

Download citation

Received: 03 October 2014
Revised: 01 November 2015
Accepted: 08 April 2016
Published: 27 April 2016
Issue Date: February 2017
DOI: https://doi.org/10.1007/s10115-016-0947-7

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

k-Degree anonymity and edge selection: improving data utility in large networks

Abstract

Access this article

Similar content being viewed by others

A Summary of $$k$$ -Degree Anonymous Methods for Privacy-Preserving on Networks

k-Degree anonymity on directed networks

The effect of distant connections on node anonymity in complex networks

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

k-Degree anonymity and edge selection: improving data utility in large networks

Abstract

Access this article

Similar content being viewed by others

A Summary of $$k$$ -Degree Anonymous Methods for Privacy-Preserving on Networks

k-Degree anonymity on directed networks

The effect of distant connections on node anonymity in complex networks

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation