Graph k-Anonymity through k-Means and as Modular Decomposition

  • Klara Stokes
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8208)

Abstract

In this paper we discuss k-anonymous graphs in terms of modular decomposition and we present two algorithms for the k-anonym-ization of graphs with respect to neighborhoods. This is the strictest definition of k-anonymity for graphs. The first algorithm is an adaptation of the k-means algorithm to neighborhood clustering in graphs. The second algorithm is distributed of message passing type, and therefore enables user-privacy: the individuals behind the vertices can jointly protect their own privacy. Although these algorithms are not optimal in terms of information loss, they are a first example of algorithms that provide k-anonymization of graphs with respect to the strictest definition, and they are simple to implement.

Keywords

k-anonymity graph modular decomposition message passing distributed algorithm k-means user-privacy 

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    AOL query logs, www.aolstalker.com
  2. 2.
    Barabási, A.-L., Albert, R.: Emergence of scaling in random networks. Science 286(5439), 509–512 (1999)MathSciNetCrossRefGoogle Scholar
  3. 3.
    Campan, A., Truta, T.M.: Data and structural anonymity in social networks. In: ACM SIGKDD International Workshop on Privacy, Security, and Trust in KDD (2008)Google Scholar
  4. 4.
    Dalenius, T.: Finding a needle in a haystack. Journal of Official Statistics 2(3), 329–336 (1986)Google Scholar
  5. 5.
    Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society. Series B (Methodological) 39(1), 1–38 (1977)MathSciNetMATHGoogle Scholar
  6. 6.
    Domingo-Ferrer, J., Torra, V.: Ordinal, continuous and heterogenerous k-anonymity through microaggregation. Data Mining and Knowledge Discovery 11(2), 195–212 (2005)MathSciNetCrossRefGoogle Scholar
  7. 7.
    Gallai, T.: Transitiv orientierbare graphen. Acta Mathematica Acad. Sci. Hungar. 18, 25–66 (1967)MathSciNetCrossRefMATHGoogle Scholar
  8. 8.
    Habib, M., Paul, C.: A survey of the algorithmic aspects of modular decomposition. Journal of Computer Science Review 4(1), 41–59 (2010)CrossRefGoogle Scholar
  9. 9.
    Hay, M., Miklau, G., Jensen, D., Towsley, D., Weis, P.: Resisting structural reidentification in anonymized social networks. In: International Conference on Very Large Data Bases (2008)Google Scholar
  10. 10.
    Hedetniemi, S.T., Laskar, R.C.: Bibliography on domination in graphs and some basic definitions of domination parameters. Discrete Mathematics 86(1-3), 257–277 (1990)MathSciNetCrossRefMATHGoogle Scholar
  11. 11.
    Hundepool, A., Domingo-Ferrer, J., Franconi, L., Giessing, S., Schulte Nordholt, E., Spicer, K., de Wolf, P.-P.: Statistical Disclosure Control. Wiley (2012)Google Scholar
  12. 12.
    Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Prentice-Hall (1981)Google Scholar
  13. 13.
    Lloyd, S.P.: Least square quantization in PCM. Bell Telephone Laboratories Paper (1957); Later published as: Least squares quantization in PCM. IEEE Transactions on Information Theory 28(2), 129–137 (1982)Google Scholar
  14. 14.
    Liu, K., Terzi, E.: Towards identity anonymization on graphs. In: ACM SIGMOD International Conference on Management of Data, pp. 93–106 (2008)Google Scholar
  15. 15.
    Liu, K., Das, K., Grandison, T., Kargupta, H.: Privacy-preserving data analysis on graphs and social networks. In: Kargupta, H., Han, J., Yu, P., Motwani, R., Kumar, V. (eds.) Next Generation of Data Mining. CRC Press (2008)Google Scholar
  16. 16.
    Lorrain, F., White, H.C.: Structural equivalence of individuals in social networks. Journal of Mathematical Sociology 1, 49–80 (1971)CrossRefGoogle Scholar
  17. 17.
    Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: L-diversity: Privacy beyond k-anonymity. J. ACM Trans. Knowl. Discov. Data 1(1), 3 (2007)CrossRefGoogle Scholar
  18. 18.
    Mooij, J.M., Kappen, H.J.: Sufficient Conditions for Convergence of the SumProduct Algorithm. IEEE Transactions on Information Theory 53(12), 4422–4437 (2007)MathSciNetCrossRefGoogle Scholar
  19. 19.
    Nettleton, D.F., Torra, V.: Data Privacy for Simply Anonymized Social Network Logs Represented as Graphs - Considerations for Graph alteration Operations. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 19(suppl. 1), 107–125 (2011)CrossRefGoogle Scholar
  20. 20.
    Newman, M.E.J., Watts, D.J., Strogatz, S.H.: Random graph models of social networks. Proc. Nat. Acad. Sci. USA 99, 2566–2572 (2002)CrossRefMATHGoogle Scholar
  21. 21.
    Pearl, J.: Reverend Bayes on inference engines: A distributed hierarchical approach. In: Proceedings of the Second National Conference on Artificial Intelligence, AAAI 1982, Pittsburgh, PA, pp. 133–136. AAAI Press, Menlo Park (1982)Google Scholar
  22. 22.
    Peña, J.M., Lozano, J.A., Larrañaga, P.: An Empirical Comparison of Four Initialization Methods for the K-Means Algorithm. Pattern Recognition Letters 20(10), 1027–1040 (1999)CrossRefGoogle Scholar
  23. 23.
    Samarati, P.: Protecting Respondents’ Identities in Microdata Release. IEEE Trans. on Knowledge and Data Engineering 13(6), 1010–1027 (2001)CrossRefGoogle Scholar
  24. 24.
    Samarati, P., Sweeney, L.: Protecting privacy when disclosing information: k–anonymity and its enforcement through generalization and suppression. SRI Intl. Tech. Rep. (1998)Google Scholar
  25. 25.
    Stein, W.A., et al.: Sage Mathematics Software (Version 5.0), The Sage Development Team (2012), http://www.sagemath.org
  26. 26.
    Stokes, K., Torra, V.: Reidentification and k-anonymity: a model for disclosure risk in graphs. Soft Computing 16(10), 1657–1670 (2012)CrossRefMATHGoogle Scholar
  27. 27.
    Stokes, K., Torra, V.: On some clustering approaches for graphs. In: FUZZ-IEEE 2011, pp. 409–415 (2011)Google Scholar
  28. 28.
    Sweeney, L.: k-anonymity: a model for protecting privacy. Int. J. of Unc., Fuzz. and Knowledge Based Systems 10(5), 557–570 (2002)MathSciNetCrossRefMATHGoogle Scholar
  29. 29.
    Viterbi, A.J.: Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory 13(2), 260–269 (1967)CrossRefMATHGoogle Scholar
  30. 30.
    Zhou, B., Pei, J.: Preserving privacy in social networks against neighborhood attacks. In: IEEE International Conference on Data Engineering (2008)Google Scholar
  31. 31.
    Zou, L., Chen, L., Tamer Özsu, M.: k-Automorphism: a general framework for privacy preserving network publication. VLDB Endowment 2(1), 946–957 (2009)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Klara Stokes
    • 1
  1. 1.Dept. of Computer and Information ScienceLinköping UniversityLinköpingSweden

Personalised recommendations