Neighborhood randomization for link privacy in social network analysis

Abstract

Social network analysis has many important applications, but it depends on sharing and publishing the underlying graph. Link privacy requires limiting an adversary's ability to infer the presence of a sensitive link between two individuals in the published social network graph. A standard technique for achieving link privacy is to probabilistically randomize a link over the space of node pairs. A major drawback of such graph-wise randomization is that it ignores the structural proximity of nodes; as a result, it considerably alters the structure of the social network and reduces the accuracy of social network analysis. To address this problem, we propose a structure-aware randomization scheme called neighborhood randomization. This scheme models a social network as a directed graph and probabilistically randomizes the destination of a link within a local neighborhood. By confining the randomization to a local neighborhood, the scheme drastically reduces the distortion of the graph structure while still hiding a sensitive link. The trade-off between privacy and utility is governed by the retention probability of a destination and by the size of the randomization neighborhood. We conduct extensive experiments to evaluate this trade-off using real-life social network data.
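
The idea of randomizing a link's destination within a local neighborhood can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not define the neighborhood or the replacement distribution, so the sketch assumes that the neighborhood of a link is the set of nodes reachable from its source within `radius` hops, and that a non-retained destination is replaced by a node drawn uniformly from that set. The names `neighborhood`, `randomize_links`, `retention_prob`, and `radius` are hypothetical.

```python
import random
from collections import deque


def neighborhood(adj, source, radius):
    """Return the nodes reachable from `source` within `radius` hops.

    ASSUMPTION: this hop-based definition is illustrative only; the
    abstract does not say how the local neighborhood is defined.
    """
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == radius:
            continue  # do not expand beyond the hop limit
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    seen.discard(source)  # the source is never its own destination
    return sorted(seen)


def randomize_links(adj, retention_prob, radius, rng=None):
    """Randomize the destination of every directed link in `adj`.

    Each link (u, v) keeps v with probability `retention_prob`;
    otherwise v is replaced by a node drawn uniformly from the
    neighborhood of u (ASSUMPTION: uniform replacement).
    """
    rng = rng or random.Random()
    published = {u: set() for u in adj}
    for u, dests in adj.items():
        candidates = neighborhood(adj, u, radius)
        for v in dests:
            if not candidates or rng.random() < retention_prob:
                published[u].add(v)  # destination retained
            else:
                # Links that land on the same node merge, so the
                # out-degree of u may shrink in the published graph.
                published[u].add(rng.choice(candidates))
    return published


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
    print(randomize_links(graph, retention_prob=0.7, radius=2,
                          rng=random.Random(0)))
```

In this sketch, a higher `retention_prob` or a smaller `radius` keeps the published graph closer to the original, while a lower `retention_prob` or a larger `radius` moves more links further away, which mirrors the privacy and utility trade-off described in the abstract.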

Author information

Corresponding author

Correspondence to Amin Milani Fard.

About this article

Cite this article

Milani Fard, A., Wang, K. Neighborhood randomization for link privacy in social network analysis. World Wide Web 18, 9–32 (2015). https://doi.org/10.1007/s11280-013-0240-6

Keywords

  • Social network analysis
  • Privacy-preserving data publishing
  • Link perturbation