User similarities on social networks

Abstract

A key problem in online social networks is identifying user characteristics and analyzing how they are reflected in the evolution of the graph structure. User similarity measures are the basis for tackling this problem. In this paper, we propose a novel user similarity measure for online social networks that combines network and profile similarity. Since user profile data may be missing, the proposed measure is complemented by a technique to infer missing items from the profiles of the user’s contacts. The second main contribution of this paper is an extensive performance evaluation of the proposed measures against some of the most relevant measures already proposed in the literature. The evaluation was conducted on a variety of data sets (Facebook, YouTube, Epinions and DBLP) to see how different scenarios and graph characteristics affect the measures’ performance.

Notes

  1. Facebook imposed auto-completion and aggregation of user inputs only in recent years. Before this, users could enter unstructured text.

  2. where |Sb| = 1 in case of non-structured items, like gender.

  3. This is done by creating the graph using only edges established before time T.

  4. In the table, precision is the number of correct inferrals over the number of all inferrals, i.e., precision = #correct inferrals / #all inferrals.

  5. http://dblp.uni-trier.de/xml/.

  6. If there is no profile information, a generic list of celebrity accounts is offered.

  7. The related videos list of a video cannot be determined by the user who uploaded the video: http://support.google.com/youtube/bin/answer.py?hl=en&answer=92651

  8. In graph theory, triadic closure refers to predictions for graphs in which two pairs of nodes have strong ties and a weak tie between the remaining pair is expected, i.e., the dashed line already exists or is expected to form in the future. Our experiments try to predict this edge.

  9. Values are rounded to two decimal places.

  10. This problem is known as the cold start problem in recommender systems.
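
Notes 3, 4 and 8 describe small procedures that may be easier to follow in code. The following is a minimal sketch, not the authors' implementation: the function names, the `(u, v, timestamp)` edge format, the strict "before T" comparison and the dictionary layout of inferred values are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def snapshot_before(timed_edges, t):
    """Note 3: build an undirected graph from edges established before time t.

    timed_edges is an iterable of (u, v, timestamp) tuples (assumed format).
    """
    adj = defaultdict(set)
    for u, v, ts in timed_edges:
        if ts < t:  # assumption: "before" is read as strictly earlier than t
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def open_triads(adj):
    """Note 8: pairs that share a neighbour but are not yet connected.

    These are the candidate edges that triadic closure would predict.
    """
    candidates = set()
    for neighbours in adj.values():
        for a, b in combinations(sorted(neighbours), 2):
            if b not in adj.get(a, set()):
                candidates.add((a, b))
    return candidates


def precision(inferred, truth):
    """Note 4: precision = #correct inferrals / #all inferrals.

    inferred and truth map a user id to a profile item value (assumed layout).
    """
    if not inferred:
        return 0.0
    correct = sum(1 for user, value in inferred.items() if truth.get(user) == value)
    return correct / len(inferred)


# Hypothetical toy data, for illustration only.
edges = [("a", "b", 1), ("b", "c", 2), ("c", "d", 3), ("a", "c", 5)]
graph_at_t = snapshot_before(edges, t=4)
candidate_edges = open_triads(graph_at_t)
score = precision({"u1": "F", "u2": "M", "u3": "F"},
                  {"u1": "F", "u2": "F", "u3": "F"})
```

In this toy example the edge between `a` and `c` is formed at time 5, so it is absent from the snapshot at T = 4 and appears among the candidate edges instead.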

Acknowledgments

The research presented in this paper was partially funded by a Google Research Award.

Author information

Corresponding author

Correspondence to Cuneyt Gurcan Akcora.

About this article

Cite this article

Akcora, C.G., Carminati, B. & Ferrari, E. User similarities on social networks. Soc. Netw. Anal. Min. 3, 475–495 (2013). https://doi.org/10.1007/s13278-012-0090-8

Keywords

  • Profile Similarity
  • Target User
  • Network Similarity
  • Profile Information
  • Social Graph