Predicting interval time for reciprocal link creation using survival analysis

Abstract

Most directed social networks, such as Twitter, Flickr and Google+, exhibit reciprocal altruism, a social psychology phenomenon that drives a vertex to create a reciprocal link to another vertex that has already created a directed link toward it. Existing work predicts whether a reciprocal link will be created, a task known as “reciprocal link prediction”. An equally important problem is determining the interval time between the creation of the first link (also called the parasocial link) and its corresponding reciprocal link. No existing work addresses this problem, which is the focus of this paper. Predicting the reciprocal link interval time is challenging for two reasons. First, effective features are lacking: well-known link prediction features are designed for undirected networks and for binary classification, so they do not work well for interval time prediction. Second, the presence of ever-waiting links (i.e., parasocial links for which no reciprocal link is formed within the observation period) makes traditional supervised regression methods unsuitable for such data. In this paper, we propose a solution for the reciprocal link interval time prediction task. We map the problem to a survival analysis task and show through extensive experiments on real-world datasets that survival analysis methods perform better than traditional regression, neural network-based models and support vector regression for reciprocal link interval time prediction.
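To illustrate why ever-waiting links call for survival analysis, the following minimal sketch treats them as right-censored observations and computes a Kaplan-Meier estimate of the probability that a parasocial link is still unreciprocated after time t. It is not the models evaluated in the paper; the function name `kaplan_meier` and the toy interval times are assumptions made purely for illustration.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of S(t) = P(no reciprocal link by time t).

    durations: for each parasocial link, the time until the reciprocal
               link appeared, or until the observation period ended.
    observed:  True if the reciprocal link was created; False if the
               link is ever-waiting (right-censored).
    Returns a list of (t, S(t)) at each time where a reciprocal link
    was observed.
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # links leaving the risk set at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Censored links still count as "at risk" up to their own time.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: six parasocial links, three of them ever-waiting.
durations = [2, 3, 3, 5, 8, 8]
observed = [True, True, False, True, False, False]

for t, s in kaplan_meier(durations, observed):
    print(f"t={t}: S(t)={s:.3f}")
```

On this toy input the estimate is about 0.833, 0.667 and 0.444 at t = 2, 3 and 5. A regression fitted only to the three reciprocated links would see a mean interval of about 3.3 and would ignore the fact that three links had still received no response when observation ended at times 3, 8 and 8.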



Notes

  1. http://konect.uni-koblenz.de/networks/.

  2. This is the Manufacturing Company email dataset, available from R. Michalski’s website, https://www.ii.pwr.edu.pl/~michalski.

  3. https://cran.r-project.org/web/packages/survival/index.html.

  4. https://cran.r-project.org/web/packages/bujar/index.html.


Acknowledgements

This research is supported by a National Science Foundation (NSF) CAREER award (IIS-1149851) and in part by NSF Grants IIS-1707498, IIS-1619028 and IIS-1646881.

Author information


Corresponding author

Correspondence to Vachik S. Dave.



Cite this article

Dave, V.S., Hasan, M.A., Zhang, B. et al. Predicting interval time for reciprocal link creation using survival analysis. Soc. Netw. Anal. Min. 8, 16 (2018). https://doi.org/10.1007/s13278-018-0494-1


Keywords

  • Link prediction
  • Directed network
  • Reciprocity
  • Time prediction
  • Survival analysis