
Toward a better scientific collaboration success prediction model through the feature space expansion

Scientometrics

An Erratum to this article was published on 10 November 2016


This study addresses the problem of predicting the success of a scientific collaboration from the scholars' previous collaborations, using machine learning techniques. Because exploiting the collaboration network is essential in collaborator discovery systems, this article attempts to understand how to exploit the information embedded in such networks. We use the link structure among scholars, and between scholars and concepts, to extract a set of features that are correlated with collaboration success and that increase prediction performance. We also examine the effect of using aggregate methods other than average and maximum to compute collaboration features from the features of the members. A dataset extracted from Northwestern University’s SciVal Expert is used to evaluate the proposed approach. The results show that the proposed collaboration features increase prediction performance when combined with widely used features such as h-index and average citation count. Consequently, the introduced features are suitable for incorporation into collaborator discovery systems.
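As a rough illustration of the aggregation idea in the abstract, the sketch below turns one per-member feature (for example, each scholar's h-index) into several collaboration-level features. The abstract names only average and maximum; the additional aggregates here (`min`, `sum`, `std`), the function name `team_features`, and the sample values are assumptions made for illustration and are not taken from the article.

```python
from statistics import mean, pstdev

# Aggregates applied to the members' values. Only "avg" and "max" are named
# in the abstract; "min", "sum" and "std" are hypothetical extra choices.
AGGREGATES = {
    "avg": mean,
    "max": max,
    "min": min,
    "sum": sum,
    "std": pstdev,  # population standard deviation across team members
}


def team_features(member_values, name="h_index"):
    """Map one per-member feature to a dict of collaboration-level features."""
    if not member_values:
        raise ValueError("a collaboration needs at least one member")
    return {
        f"{name}_{agg}": float(fn(member_values))
        for agg, fn in AGGREGATES.items()
    }


# Made-up h-index values for a three-person collaboration.
features = team_features([12, 30, 7])
print(features)
```

Each resulting key (such as `h_index_avg` or `h_index_std`) could then serve as one column of the feature vector given to a classifier, which is one way to read the "feature space expansion" in the title.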



  1. Computer supported cooperative work.




  5. Medical Subject Heading.


  7. SciVal Expert assigns a unique identifier (uid) to each scholar.


  9. Multilayer Perceptron.



Author information


Corresponding author

Correspondence to Kamran Zamanifar.

Additional information

An erratum to this article is available at

About this article


Cite this article

Ghasemian, F., Zamanifar, K., Ghasem-Aqaee, N. et al. Toward a better scientific collaboration success prediction model through the feature space expansion. Scientometrics 108, 777–801 (2016).
