Link prediction in co-authorship networks based on hybrid content similarity metric

Abstract

Link prediction in online social networks identifies new interactions among members that are likely to occur in the future. Link prediction in co-authorship networks has been one of the main targets of link prediction research so far. Researchers have focused on analyzing and proposing solutions that efficiently recommend authors who could work together on a scientific project. To predict links between any two authors in a co-authorship network precisely, it is preferable to design a similarity metric between them and then use it to determine the most likely co-author(s). However, the relevant studies did not integrate the content of papers into the metric itself. This matters when considering collaboration between scientists, since authors who share research interests are more likely to write a joint paper than those working in different areas. In this paper, we propose a new content-similarity-based metric for link prediction in co-authorship networks, named LDAcosin. We present mathematical notions of link prediction in the co-authorship network and a link prediction algorithm based on topic modeling. The new metric is experimentally validated on a public bibliographic collection.
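The abstract does not give the definition of LDAcosin, so the following is only a rough sketch of the general idea it describes: scoring author pairs by the similarity of their paper content. It assumes, going by the metric's name, that each author is represented by a topic distribution (for example from an LDA model) and that pairs are compared by cosine similarity. The function names, the author labels, and the topic vectors are all hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def cosine_similarity(p, q):
    """Cosine of the angle between two equal-length topic-weight vectors."""
    if len(p) != len(q):
        raise ValueError("topic vectors must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an all-zero vector carries no content signal
    return dot / (norm_p * norm_q)


def rank_candidate_links(author_topics, existing_links):
    """Rank author pairs that are not yet co-authors, most similar first."""
    known = {frozenset(pair) for pair in existing_links}
    scored = []
    for a, b in combinations(sorted(author_topics), 2):
        if frozenset((a, b)) in known:
            continue  # already linked, nothing to predict
        score = cosine_similarity(author_topics[a], author_topics[b])
        scored.append((score, a, b))
    return sorted(scored, reverse=True)


# Hypothetical per-author topic distributions (assumed to come from a topic
# model fitted on each author's papers; the values are made up).
author_topics = {
    "A": [0.7, 0.2, 0.1],
    "B": [0.6, 0.3, 0.1],
    "C": [0.1, 0.1, 0.8],
}
existing_links = [("A", "C")]

ranking = rank_candidate_links(author_topics, existing_links)
for score, a, b in ranking:
    print(f"{a}-{b}: {score:.3f}")
```

In this toy input, authors A and B have nearly the same topic mix, so their pair ranks above B and C. A hybrid metric such as the one named in the title would presumably combine a content score like this with topological information, but how that is done is not stated in the abstract.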

Acknowledgments

This research was supported by the Center for Research and Applications in Science and Technology, Hung Yen University of Technology and Education, under grant number UTEHY.T026.P1718.02.

Author information

Corresponding author

Correspondence to Le Hoang Son.

Appendix

The source code and datasets for this research are available at: https://sourceforge.net/p/affin-ldacosin/code/ci/master/tree/

About this article

Cite this article

Chuan, P.M., Son, L.H., Ali, M. et al. Link prediction in co-authorship networks based on hybrid content similarity metric. Appl Intell 48, 2470–2486 (2018). https://doi.org/10.1007/s10489-017-1086-x

  • DOI: https://doi.org/10.1007/s10489-017-1086-x

Keywords

  • Link prediction
  • Co-authorship networks
  • Network topology
  • LDA
  • Topic modeling