Social impact assessment of scientist from mainstream news and weblogs

  • Mohan Timilsina
  • Waqas Khawaja
  • Brian Davis
  • Mike Taylor
  • Conor Hayes
Original Article


Research policy makers, funding agencies, universities and government organizations evaluate research output or impact based on traditional citation counts, peer review, the h-index and journal impact factors. These impact measures, also known as bibliometric indicators, are limited to the academic community and cannot provide a broad perspective on research impact among the public, government or business. The understanding that scholarly impact extends beyond the scientific and academic sphere has given rise to an area of scientometrics called alternative metrics or “altmetrics.” Researchers in this area tend to focus on gauging scientific activity via social media, notably Twitter. However, these count-based measurements of impact are susceptible to gaming because they lack concrete references to the primary source. In this work, we expand a conventional citation graph to a heterogeneous graph of publications, scientists, venues and organizations based on more reliable social media sources such as mainstream news and weblogs. Our method is composed of two components. The first combines bibliometric data with social media data such as blogs and mainstream news. The second investigates how standard graph-based metrics can be applied to a heterogeneous graph to predict academic impact. Our results show moderate correlations and positive associations between the computed graph-based metrics and academic impact, and the metrics predict the academic impact of researchers reasonably well.
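The two components described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names only "standard graph-based metrics", so PageRank is used here as an assumed stand-in, and the toy edge list, the node-type prefixes (`news:`, `blog:`, `paper:`, `scientist:`) and the function names are hypothetical. The h-index function follows the standard definition and stands for the academic impact that such a graph score would be compared against.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over a directed edge list (plain Python)."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Score held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: news and blog mentions point to papers,
# papers cite papers, and papers point to their authors.
edges = [
    ("news:n1", "paper:p1"),
    ("blog:b1", "paper:p1"),
    ("blog:b2", "paper:p2"),
    ("paper:p2", "paper:p1"),          # citation edge
    ("paper:p1", "scientist:alice"),   # authorship edge
    ("paper:p2", "scientist:bob"),     # authorship edge
]

scores = pagerank(edges)
scientist_scores = {n: s for n, s in scores.items() if n.startswith("scientist:")}
```

In this toy graph the scientist whose paper attracts more mentions and citations receives the higher score. Whether such scores track the h-index on real data is the empirical question the paper addresses.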


Altmetrics Heterogeneous Graph Impact h-index Scientist Prediction 



We would like to acknowledge Science Foundation Ireland (SFI/12/RC/2289) and the targeted project Elsevier for funding this research. We extend our gratitude to John Lonican for creating a citation graph from the SCOPUS database and to Erik Aumayr for insightful thoughts and constructive criticism. We also thank Prof. Jonice Oliveira from the Federal University of Rio de Janeiro for creative feedback and support.



Copyright information

© Springer-Verlag GmbH Austria 2017

Authors and Affiliations

  1. Insight Centre for Data Analytics, National University of Ireland Galway, Galway, Ireland
  2. Statistical Cybermetrics Research Group, University of Wolverhampton, Wolverhampton, UK
