Applied Intelligence, Volume 48, Issue 8, pp 2470–2486

Link prediction in co-authorship networks based on hybrid content similarity metric

  • Pham Minh Chuan
  • Le Hoang Son
  • Mumtaz Ali
  • Tran Dinh Khang
  • Le Thanh Huong
  • Nilanjan Dey


Link prediction in online social networks is used to determine new interactions among members that are likely to occur in the future. Link prediction in co-authorship networks has been regarded as one of the main targets of link prediction research so far. Researchers have focused on analyzing and proposing solutions that give efficient recommendations for authors who can work together on a science project. To predict links between two authors in a co-authorship network precisely, it is preferable to design a similarity metric between them and then use it to determine the most likely co-author(s). However, the relevant studies did not integrate the papers' content into the metric itself. This is important when considering collaboration between scientists, since authors with the same research interests are more likely to write a joint paper than those working in different areas. In this paper, we propose a new metric for link prediction in co-authorship networks based on content similarity, named LDAcosin. Mathematical notions of link prediction in the co-authorship network and a link prediction algorithm based on topic modeling are proposed. The new metric is experimentally validated on a public bibliographic collection.
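
As a rough illustration of a content-based similarity score, the sketch below compares authors by the cosine of their topic-proportion vectors, which is what the name LDAcosin suggests. This is an assumption rather than the paper's definition, since the abstract does not state the formula. The function names `cosine_similarity` and `rank_candidates` and the example topic vectors are hypothetical; in practice each vector would come from an LDA model fitted on an author's papers.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Cosine of the angle between two topic-proportion vectors."""
    if len(p) != len(q):
        raise ValueError("topic vectors must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # no content information for at least one author
    return dot / (norm_p * norm_q)


def rank_candidates(
    target: str, topics: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Rank all other authors by content similarity to `target`, best first."""
    scores = [
        (name, cosine_similarity(topics[target], vec))
        for name, vec in topics.items()
        if name != target
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical topic proportions over three topics (illustrative values only).
author_topics = {
    "A": [0.7, 0.2, 0.1],
    "B": [0.6, 0.3, 0.1],
    "C": [0.1, 0.1, 0.8],
}

for name, score in rank_candidates("A", author_topics):
    print(f"{name}: {score:.3f}")
```

Under these made-up vectors, author B ranks above author C as a candidate co-author for A, because B's topic mix is much closer to A's. A score like this uses content only; how it would be combined with topological features of the network is not specified in the abstract.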


Keywords: Link prediction, Co-authorship networks, Network topology, LDA, Topic modeling



This research was supported by the Center for Research and Applications in Science and Technology, Hung Yen University of Technology and Education, under grant number UTEHY.T026.P1718.02.



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Hung Yen University of Technology and Education, Hung Yen, Vietnam
  2. VNU University of Science, Vietnam National University, Thanh Xuan, Vietnam
  3. University of Southern Queensland, Toowoomba, Australia
  4. School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi, Vietnam
  5. Techno India College of Technology, Kolkata, India
