Anonymizing graphs: measuring quality for clustering

Casas-Roma, Jordi; Herrera-Joancomartí, Jordi; Torra, Vicenç

doi:10.1007/s10115-014-0774-7


Regular Paper
Published: 06 August 2014

Volume 44, pages 507–528, (2015)

Knowledge and Information Systems



Abstract

Anonymization of graph-based data has been widely studied in recent years, and several anonymization methods have been developed. Information loss measures are used to evaluate the noise introduced in the anonymized data. Generic information loss measures ignore the intended use of the anonymized data. When data has to be released to third parties, and there is no control over what kind of analyses users might perform, these measures are the standard ones. In this paper we study different generic information loss measures for graphs and compare them to cluster-specific ones. We want to evaluate whether the generic information loss measures are indicative of the usefulness of the data for subsequent data mining processes.
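To make the distinction between the two kinds of measure concrete, the following is a minimal sketch and not the measures evaluated in the paper. It assumes an edge-set Jaccard distance as a stand-in for a generic information loss measure and the Rand index as a stand-in for a cluster-specific one. The function names, the toy graph, and the hand-written cluster labels are all hypothetical.

```python
from itertools import combinations


def edge_jaccard_loss(original_edges, anonymized_edges):
    """Generic measure (assumed): 1 - Jaccard similarity of two undirected edge sets."""
    a = {frozenset(e) for e in original_edges}
    b = {frozenset(e) for e in anonymized_edges}
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def rand_index(labels_a, labels_b):
    """Cluster-specific measure (assumed): fraction of vertex pairs on which
    two clusterings agree (both together or both apart)."""
    pairs = list(combinations(sorted(labels_a), 2))
    if not pairs:
        return 1.0
    agree = sum(
        (labels_a[u] == labels_a[v]) == (labels_b[u] == labels_b[v])
        for u, v in pairs
    )
    return agree / len(pairs)


# Toy graph: two triangles joined by the bridge (2, 3).
original = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
# Hypothetical perturbation: edge (0, 1) removed, edge (1, 3) added.
anonymized = [(1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3), (1, 3)]

# Hypothetical cluster labels, as some clustering algorithm might return them.
clusters_original = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
clusters_anonymized = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

generic_loss = edge_jaccard_loss(original, anonymized)
cluster_agreement = rand_index(clusters_original, clusters_anonymized)
print(generic_loss, cluster_agreement)
```

In this toy case the generic measure reports a loss of 0.25 while the clusterings agree completely, which illustrates why a generic measure need not track the usefulness of the data for clustering.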



Acknowledgments

This work was partly funded by the Spanish Government through projects TIN2011-27076-C03-02 “CO-PRIVACY”, CONSOLIDER INGENIO 2010 CSD2007-0004 “ARES” and TIN2010-15764 “N-KHRONOUS”.

Author information

Authors and Affiliations

Universitat Oberta de Catalunya (UOC), Barcelona, Spain
Jordi Casas-Roma
Universitat Autònoma de Barcelona (UAB), Bellaterra, Spain
Jordi Herrera-Joancomartí
Artificial Intelligence Research Institute (IIIA), Spanish National Research Council (CSIC), Bellaterra, Spain
Vicenç Torra


Corresponding author

Correspondence to Jordi Casas-Roma.


About this article

Cite this article

Casas-Roma, J., Herrera-Joancomartí, J. & Torra, V. Anonymizing graphs: measuring quality for clustering. Knowl Inf Syst 44, 507–528 (2015). https://doi.org/10.1007/s10115-014-0774-7


Received: 16 October 2013
Revised: 11 July 2014
Accepted: 27 July 2014
Published: 06 August 2014
Issue Date: September 2015
DOI: https://doi.org/10.1007/s10115-014-0774-7
