Abstract
Many applications of social networks require relationship anonymity because the relationships themselves are sensitive, stigmatizing, or confidential. Recent work showed that simply anonymizing a graph by replacing the identifying information of its nodes with random IDs does not guarantee privacy, since nodes can still be re-identified through subgraph queries. In this paper, we investigate how well an edge-based graph randomization approach can protect sensitive links. We show via theoretical studies and empirical evaluations that attackers can exploit various similarity measures to significantly improve their confidence and accuracy when predicting sensitive links between nodes with high similarity values. We also compare our similarity measure-based prediction methods with low-rank approximation-based prediction.
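The attack idea summarized above can be illustrated with a small sketch. It is not the paper's implementation: it assumes the randomization deletes k true edges and adds k false ones, it uses the common-neighbor count as the only similarity measure, and every name in it (`two_cliques`, `randomize_add_del`, `common_neighbors`, `edge_fraction_by_similarity`) is hypothetical. For each similarity value observed in the published graph, it reports the fraction of node pairs that are true edges in the original graph.

```python
import random
from itertools import combinations


def two_cliques(size):
    """Toy graph (an assumption): two cliques joined by one bridge edge."""
    a = set(combinations(range(size), 2))
    b = set(combinations(range(size, 2 * size), 2))
    return a | b | {(size - 1, size)}


def randomize_add_del(edges, n, k, rng):
    """Delete k true edges and add k false edges, chosen uniformly at random."""
    edges = set(edges)
    non_edges = [p for p in combinations(range(n), 2) if p not in edges]
    removed = rng.sample(sorted(edges), k)
    added = rng.sample(non_edges, k)
    return (edges - set(removed)) | set(added)


def common_neighbors(edges, n):
    """Common-neighbor count for every node pair (u, v) with u < v."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return {(u, v): len(adj[u] & adj[v]) for u, v in combinations(range(n), 2)}


def edge_fraction_by_similarity(true_edges, published_edges, n):
    """Group pairs by similarity in the published graph; return, per group,
    the fraction of pairs that are edges in the original graph."""
    buckets = {}
    for pair, s in common_neighbors(published_edges, n).items():
        hits, total = buckets.get(s, (0, 0))
        buckets[s] = (hits + (pair in true_edges), total + 1)
    return {s: hits / total for s, (hits, total) in sorted(buckets.items())}


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 12
original = two_cliques(6)
published = randomize_add_del(original, n, k=5, rng=rng)
for s, frac in edge_fraction_by_similarity(original, published, n).items():
    print(f"similarity={s}: fraction of true edges={frac:.2f}")
```

On a graph with strong community structure such as this toy example, pairs with many common neighbors in the published graph would be expected to be true edges more often than pairs with few, which is the kind of confidence gain the abstract describes. The toy graph and the value of k are illustrative choices only.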
Cite this article
Ying, X., Wu, X. On link privacy in randomizing social networks. Knowl Inf Syst 28, 645–663 (2011). https://doi.org/10.1007/s10115-010-0353-5
DOI: https://doi.org/10.1007/s10115-010-0353-5