
Algorithmica, Volume 80, Issue 11, pp 3397–3427

De-anonymization of Heterogeneous Random Graphs in Quasilinear Time

  • Karl Bringmann
  • Tobias Friedrich
  • Anton Krohmer

Abstract

There are hundreds of online social networks with altogether billions of users. Many such networks publicly release structural information, with all personal information removed. Empirical studies have shown, however, that this provides a false sense of privacy: it is possible to identify almost all users that appear in two such anonymized networks as long as a few initial mappings are known. We analyze this problem theoretically by reconciling two versions of an artificial power-law network arising from independent subsampling of vertices and edges. We present a new algorithm that identifies most vertices and makes no wrong identifications with high probability. The number of vertices matched is shown to be asymptotically optimal. For an \(n\)-vertex graph, our algorithm uses \(n^\varepsilon\) seed nodes (for an arbitrarily small \(\varepsilon\)) and runs in quasilinear time. This improves on previous theoretical results, which need \(\Theta(n)\) seed nodes and have runtimes of order \(n^{1+\Omega(1)}\). Additionally, the applicability of our algorithm is studied experimentally on different networks.
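
As a rough illustration of the setting described in the abstract, the following Python sketch builds two views of one underlying graph by independently subsampling vertices and edges, then hides the labels of the second view. It is only a sketch under stated assumptions, not the paper's algorithm or its exact model: the function names, the parameters `p_vertex` and `p_edge`, and the use of a random relabeling to stand in for anonymization are all hypothetical choices made for this example.

```python
import random


def subsample(vertices, edges, p_vertex, p_edge, rng):
    """Keep each vertex with probability p_vertex, then keep each edge
    between surviving vertices with probability p_edge (all independent)."""
    kept = {v for v in vertices if rng.random() < p_vertex}
    sub_edges = [
        (u, v)
        for (u, v) in edges
        if u in kept and v in kept and rng.random() < p_edge
    ]
    return kept, sub_edges


def anonymize(vertices, edges, rng):
    """Relabel vertices with a random permutation (assumed stand-in for
    removing personal information). Returns the hidden mapping as well."""
    ordered = sorted(vertices)
    labels = list(range(len(ordered)))
    rng.shuffle(labels)
    mapping = dict(zip(ordered, labels))
    new_edges = [(mapping[u], mapping[v]) for (u, v) in edges]
    return set(mapping.values()), new_edges, mapping


def two_views(vertices, edges, p_vertex, p_edge, seed=0):
    """Produce two independently subsampled views of the same graph;
    the second view is relabeled so its identities are hidden."""
    rng = random.Random(seed)
    v1, e1 = subsample(vertices, edges, p_vertex, p_edge, rng)
    v2, e2 = subsample(vertices, edges, p_vertex, p_edge, rng)
    v2_anon, e2_anon, hidden = anonymize(v2, e2, rng)
    return (v1, e1), (v2_anon, e2_anon), hidden
```

A de-anonymization procedure would receive the two views plus a small set of known pairs from `hidden` (the seed nodes) and try to recover the rest of the mapping.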

Keywords

  • Social networks
  • Locality-sensitive hashing
  • Network privacy

Notes

Acknowledgements

We thank Silvio Lattanzi from Google Inc. for fruitful discussions, sharing their data sets, and sending us a preliminary version of [18] at the early stages of this project. Karl Bringmann is a recipient of the Google Europe Fellowship in Randomized Algorithms, and this research is supported in part by this Google Fellowship. Tobias Friedrich received funding from the German Research Foundation (DFG) under Grant Agreement No. FR 2988 (ADLON).

References

  1. Aiello, W., Chung, F., Lu, L.: A random graph model for massive graphs. In: 32nd Annual ACM Symposium on Theory of Computing (STOC), pp. 171–180 (2000)
  2. Aiello, W., Chung, F., Lu, L.: A random graph model for power law graphs. Exp. Math. 10(1), 53–66 (2001)
  3. Alon, N., Spencer, J.H.: The Probabilistic Method, 3rd edn. Wiley, Hoboken (2008)
  4. Amini, H., Fountoulakis, N.: What I tell you three times is true: bootstrap percolation in small worlds. In: 8th International Workshop on Internet and Network Economics (WINE), pp. 462–474 (2012)
  5. Arvind, V., Köbler, J., Kuhnert, S., Vasudev, Y.: Approximate graph isomorphism. In: 37th International Symposium on Mathematical Foundations of Computer Science (MFCS), pp. 100–111. Springer (2012)
  6. Backstrom, L., Dwork, C., Kleinberg, J.: Wherefore art thou r3579x? Anonymized social networks, hidden patterns, and structural steganography. Commun. ACM 54(12), 133–141 (2011)
  7. Barabási, A.-L., Albert, R.: Emergence of scaling in random networks. Science 286, 509–512 (1999)
  8. Bringmann, K., Friedrich, T., Krohmer, A.: De-anonymization of heterogeneous random graphs in quasilinear time. In: 22nd European Symposium on Algorithms (ESA). Lecture Notes in Computer Science. Springer (2014)
  9. Chung, F., Lu, L.: Connected components in random graphs with given expected degree sequences. Ann. Comb. 6(2), 125–145 (2002)
  10. Chung, F., Lu, L.: The average distances in random graphs with given expected degrees. Proc. Natl. Acad. Sci. (PNAS) 99(25), 15879–15882 (2002)
  11. Chung, F., Lu, L.: The average distance in a random graph with given expected degrees. Internet Math. 1(1), 91–113 (2004)
  12. Dereich, S., Mönch, C., Mörters, P.: Typical distances in ultrasmall random networks. Adv. Appl. Prob. 44(2), 583–601 (2012)
  13. Dwork, C.: Differential privacy. In: 33rd International Colloquium on Automata, Languages, and Programming (ICALP), pp. 1–12 (2006)
  14. Fountoulakis, N., Panagiotou, K., Sauerwald, T.: Ultra-fast rumor spreading in social networks. In: 23rd Symposium on Discrete Algorithms (SODA), pp. 1642–1660 (2012)
  15. Friedrich, T., Krohmer, A.: Parameterized clique on scale-free networks. In: 23rd International Symposium on Algorithms and Computation (ISAAC), pp. 659–668 (2012)
  16. Indyk, P., Motwani, R.: Approximate nearest neighbors: towards removing the curse of dimensionality. In: 30th Annual ACM Symposium on Theory of Computing (STOC), pp. 604–613 (1998)
  17. Kim, J.H., Vu, V.H.: Concentration of multivariate polynomials and its applications. Combinatorica 20(3), 417–434 (2000)
  18. Korula, N., Lattanzi, S.: An efficient reconciliation algorithm for social networks. In: 40th International Conference on Very Large Data Bases (VLDB), pp. 377–388 (2014)
  19. Lattanzi, S., Sivakumar, D.: Affiliation networks. In: 41st Annual ACM Symposium on Theory of Computing (STOC), pp. 427–434 (2009)
  20. McGregor, A., Mironov, I., Pitassi, T., Reingold, O., Talwar, K., Vadhan, S.P.: The limits of two-party differential privacy. In: 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 81–90 (2010)
  21. Mitzenmacher, M.: A brief history of generative models for power law and lognormal distributions. Internet Math. 1(2), 226–251 (2004)
  22. Narayanan, A., Shmatikov, V.: De-anonymizing social networks. In: 30th IEEE Symposium on Security and Privacy (SP), pp. 173–187 (2009)
  23. Newman, I., Sohler, C.: Every property of hyperfinite graphs is testable. In: 43rd Annual ACM Symposium on Theory of Computing (STOC), pp. 675–684 (2011)
  24. Newman, M.E.J.: The structure and function of complex networks. SIAM Rev. 45(2), 167–256 (2003)
  25. Novak, J., Raghavan, P., Tomkins, A.: Anti-aliasing on the web. In: 13th International Conference on World Wide Web (WWW), pp. 30–39 (2004)
  26. Rao, J.R., Rohatgi, P.: Can pseudonymity really guarantee privacy? In: 9th USENIX Security Symposium (USENIX), pp. 85–96 (2000)
  27. Sala, A., Zhao, X., Wilson, C., Zheng, H., Zhao, B.Y.: Sharing graphs using differentially private graph models. In: ACM SIGCOMM Conference on Internet Measurement Conference (IMC), pp. 81–98 (2011)
  28. van der Hofstad, R.: Random graphs and complex networks. www.win.tue.nl/~rhofstad/NotesRGCN.pdf (2014)
  29. Vijayraghavan, A., Wu, Y., Yoshida, Y., Zhou, Y.: Graph isomorphism: approximate and robust. Unpublished manuscript, available from the authors (2013)
  30. Wondracek, G., Holz, T., Kirda, E., Kruegel, C.: A practical attack to de-anonymize social network users. In: IEEE Symposium on Security and Privacy (SP), pp. 223–238 (2010)
  31. Zafarani, R., Liu, H.: Connecting corresponding identities across communities. In: 3rd International Conference on Weblogs and Social Media (ICWSM), pp. 354–357 (2009)
  32. Zheleva, E., Getoor, L.: To join or not to join: the illusion of privacy in social networks with mixed public and private user profiles. In: 18th International Conference on World Wide Web (WWW), pp. 531–540 (2009)

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2017

Authors and Affiliations

  1. Max Planck Institute for Informatics, Saarbrücken, Germany
  2. Hasso Plattner Institute, Potsdam, Germany
