Do citations and readership identify seminal publications?

Abstract

This work presents a new approach for analysing the ability of existing research metrics to identify research which has strongly influenced future developments. More specifically, we focus on the ability of citation counts and Mendeley reader counts to distinguish between publications that field experts regard as seminal and those they regard as literature reviews. The main motivation behind our research is to gain a better understanding of whether and how well existing research metrics relate to research quality. For this experiment, we created a new dataset, which we call TrueImpactDataset, containing two types of publications: seminal papers and literature reviews. Using the dataset, we conduct a set of experiments to study how well citation and reader counts distinguish these publication types, following the intuition that causing a change in a field signifies research quality. Our research shows that citation counts perform better than a random baseline (by a margin of 10%) in distinguishing important seminal research papers from literature reviews, while Mendeley reader counts do not perform better than the baseline.
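To make the comparison against a baseline concrete, the following is a minimal sketch of a single-feature classification experiment. It is not the authors' implementation: the threshold classifier, the leave-one-out evaluation, the majority-class baseline (a stand-in, since the abstract does not say how the random baseline is computed), the function names and the toy counts are all assumptions made for illustration.

```python
def fit_threshold(samples):
    """Find the cut-off on a count that best separates the two classes.

    samples: list of (count, label) pairs; label is "seminal" or "review".
    Returns (threshold, label_above): counts above the threshold are
    assigned label_above, all others the opposite label.
    """
    values = sorted({count for count, _ in samples})
    # Candidate cut-offs: one below every value, then each midpoint
    # between neighbouring distinct values.
    candidates = [values[0] - 1] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for threshold in candidates:
        for label_above in ("seminal", "review"):
            correct = sum(
                predict(count, threshold, label_above) == label
                for count, label in samples
            )
            if best is None or correct > best[0]:
                best = (correct, threshold, label_above)
    return best[1], best[2]


def predict(count, threshold, label_above):
    """Assign label_above to counts strictly above the threshold."""
    label_below = "review" if label_above == "seminal" else "seminal"
    return label_above if count > threshold else label_below


def leave_one_out_accuracy(samples):
    """Train on all papers but one, test on the held-out paper, repeat."""
    hits = 0
    for i, (count, label) in enumerate(samples):
        threshold, label_above = fit_threshold(samples[:i] + samples[i + 1:])
        hits += predict(count, threshold, label_above) == label
    return hits / len(samples)


def majority_baseline_accuracy(samples):
    """Accuracy of always predicting the more frequent class."""
    labels = [label for _, label in samples]
    return max(labels.count("seminal"), labels.count("review")) / len(labels)


if __name__ == "__main__":
    # Made-up citation counts, for illustration only (not from the dataset).
    toy = [(520, "seminal"), (35, "seminal"), (210, "seminal"), (90, "seminal"),
           (60, "review"), (15, "review"), (140, "review"), (25, "review")]
    print("leave-one-out accuracy:", leave_one_out_accuracy(toy))
    print("majority baseline:     ", majority_baseline_accuracy(toy))
```

The same functions could be applied to reader counts by swapping the first element of each pair; the difference between the two printed accuracies corresponds to the kind of margin over a baseline reported above.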

Notes

  1. With some exceptions, notably systematic reviews, which are a key practice in evidence-based medicine.

  2. http://trueimpactdataset.semantometrics.org/

  3. As there are Copyright Exceptions for text and data mining in some countries, such as in the UK, we are happy to provide the PDF documents for these purposes to researchers residing in these jurisdictions upon request.

  4. http://dev.mendeley.com

  5. http://ipscience-help.thomsonreuters.com/wosWebServicesLite/WebServicesLiteOverviewGroup/Introduction.html

  6. http://admin-apps.webofknowledge.com/JCR/help/h_sourcedata.htm#sourcedata

Author information

Corresponding author

Correspondence to Drahomira Herrmannova.

Additional information

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Appendix A: Experiment results

See Tables 13, 14, 15, 16, 17 and 18.

Table 13. Results of independent one-tailed t-test performed using citation and readership counts on all disciplines separately
Table 14. Classification results using citation counts as a feature, performed on all disciplines separately
Table 15. Classification results using reader counts as a feature, performed on all disciplines separately
Table 16. Results of independent one-tailed t-test performed using citation and readership counts on all publication years separately
Table 17. Classification results using citation counts as a feature, performed on all years separately
Table 18. Classification results using reader counts as a feature, performed on all years separately
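
As a rough illustration of the independent one-tailed t-test named in the captions above, the sketch below computes the test statistic and degrees of freedom for two groups of counts. It assumes Welch's unequal-variance form and the direction "seminal greater than review"; the captions do not state which variant or direction was used, and the function name and toy counts are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for the alternative mean(sample_a) > mean(sample_b).

    Uses Welch's unequal-variance t statistic and the
    Welch-Satterthwaite approximation for the degrees of freedom.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


if __name__ == "__main__":
    # Made-up citation counts, for illustration only (not from the dataset).
    seminal = [520, 35, 210, 90, 310]
    review = [60, 15, 140, 25, 80]
    t, df = welch_t(seminal, review)
    print(f"t = {t:.3f}, df = {df:.2f}")
```

A positive t supports the assumed direction; the one-tailed p-value is the upper-tail probability of the t distribution at that statistic. If SciPy is available, `scipy.stats.ttest_ind(seminal, review, equal_var=False, alternative="greater")` should return the same statistic together with the p-value.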

About this article

Cite this article

Herrmannova, D., Patton, R.M., Knoth, P. et al. Do citations and readership identify seminal publications? Scientometrics 115, 239–262 (2018). https://doi.org/10.1007/s11192-018-2669-y

Keywords

  • Information retrieval
  • Scholarly communication
  • Publication datasets
  • Data mining
  • Research evaluation
  • Bibliometrics
  • Altmetrics