Semantic similarity-based credit attribution on citation paths: a method for allocating residual citation to and investigating depth of influence of scientific communications

Scientometrics

Abstract

This study proposes a method for assessing the impact of scientific communications along their citation paths, beyond conventional direct citations. The method treats the contribution of a scientific communication to its nth-generation citations as the basis for calculating residual citations. It reconstitutes the residual citations that are lost in subsequent citations, in the second, third, or nth generation, through the citation practices termed “Obliteration by Incorporation” and the “Palimpsestic Syndrome”. The method is based on the semantic similarity between the citation contexts of a publication and the citation contexts of its nth-generation citations in their (n + 1)th-generation citations. It was demonstrated on a sample of biomedical publications comprising ten base articles and five generations of their citations. As in the cascading citation system, the residual citations accruing to articles from successive generations of citations decreased as the number of generations increased. However, the residual citation weights accrued by publications were statistically different between the proposed method and the cascading citation system at every generation level. The method opens a new frontier for assessing the depth of a publication’s impact beyond the conventional direct citation level.
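As a rough illustration of the idea summarized in the abstract, the sketch below scores how closely later-generation citation contexts resemble the context in which a base article was originally cited, and treats each score as a residual citation weight. It is not the authors' implementation: the function names (`bow`, `cosine`, `residual_weights`), the use of bag-of-words cosine similarity as a stand-in for a semantic similarity model, the choice to compare each later context directly with the base context, and the sample sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bow(text):
    """Bag-of-words term counts; a crude, assumed stand-in for a sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def residual_weights(base_context, later_contexts):
    """Hypothetical weighting: one similarity score per later-generation context.

    base_context   -- text in which a first-generation paper cites the base article
    later_contexts -- texts in which each later generation cites the previous one
    """
    base_vec = bow(base_context)
    return [cosine(base_vec, bow(ctx)) for ctx in later_contexts]


# Invented citation contexts, for illustration only.
base = "Sentence embeddings improve similarity estimation for biomedical text"
later = [
    "Embeddings improve similarity estimation for biomedical sentences",
    "Similarity estimation was used to rank biomedical articles",
    "We follow the standard evaluation protocol",
]

for generation, weight in enumerate(residual_weights(base, later), start=2):
    print(f"generation {generation}: residual weight {weight:.3f}")
```

A context that restates the base article's idea scores close to 1, while an unrelated context scores close to 0, so credit flows to the base article only where its content is still visible in later generations. In practice, a trained sentence similarity model would replace the bag-of-words step.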

Data availability

The datasets generated and/or analyzed during the current study are available in the Mendeley repository, through https://doi.org/10.17632/6fgjxkv28d.3.

References

  • An, J., Kim, N., Kan, M.-Y., Chandrasekaran, M. K., & Song, M. (2017). Exploring characteristics of highly cited authors according to citation location and content. Journal of the Association for Information Science and Technology, 68(8), 1975–1988.

  • Asubiaro, T. V. (2021). Exploiting semantic similarity between citation contexts for direct citation weighting and residual citation [Doctoral Thesis, The University of Western Ontario]. https://ir.lib.uwo.ca/etd/8008/

  • Asubiaro, T. V., & Ajiferuke, I. (2021). A proposed method for residual citation allocation based on citation contexts similarity. Research Square. https://doi.org/10.21203/rs.3.rs-1041491/v1

  • Athar, A., & Teufel, S. (2012). Detection of implicit citations for sentiment detection. Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, 18–26.

  • Boyack, K. W., van Eck, N. J., Colavizza, G., & Waltman, L. (2018). Characterizing in-text citations in scientific articles: A large-scale analysis. Journal of Informetrics, 12(1), 59–73. https://doi.org/10.1016/j.joi.2017.11.005

  • Chen, Q., Peng, Y., & Lu, Z. (2019). BioSentVec: Creating sentence embeddings for biomedical texts. In The Seventh IEEE International Conference on Healthcare Informatics (p. 5). IEEE: Beijing, China. https://doi.org/10.1109/ICHI.2019.8904728

  • Cohan, A., Ammar, W., van Zuylen, M., & Cady, F. (2019). Structural scaffolds for citation intent classification in scientific publications. Proceedings of the 2019 Conference of the North, 3586–3596. https://doi.org/10.18653/v1/N19-1361

  • Dervos, D. A., & Kalkanis, T. (2005). cc-IFF: A cascading citations impact factor framework for the automatic ranking of research publications. IEEE Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, 2005, 668–673. https://doi.org/10.1109/IDAACS.2005.283070

  • Ding, Y., Liu, X., Guo, C., & Cronin, B. (2013). The distribution of references across texts: Some implications for citation analysis. Journal of Informetrics, 7(3), 583–592. https://doi.org/10.1016/j.joi.2013.03.003

  • Ding, Y., Zhang, G., Chambers, T., Song, M., Wang, X., & Zhai, C. (2014). Content-based citation analysis: The next generation of citation analysis. Journal of the Association for Information Science and Technology, 65(9), 1820–1833.

  • Doslu, M., & Bingol, H. O. (2016). Context sensitive article ranking with citation context analysis. Scientometrics, 108(2), 653–671. https://doi.org/10.1007/s11192-016-1982-6

  • Einstein, A. (1905). Zur Elektrodynamik bewegter Körper. Annalen Der Physik, 322(10), 891–921. https://doi.org/10.1002/andp.19053221004

  • Fragkiadaki, E., Evangelidis, G., Samaras, N., & Dervos, D. A. (2009). Cascading citations indexing framework algorithm implementation and testing. 2009 13th Panhellenic Conference on Informatics, 70–74. https://doi.org/10.1109/PCI.2009.30

  • Han, M., Zhang, X., Yuan, X., Jiang, J., Yun, W., & Gao, C. (2021). A survey on the techniques, applications, and performance of short text semantic similarity. Concurrency and Computation: Practice and Experience, 33(5), e5971. https://doi.org/10.1002/cpe.5971

  • Hassan, S.-U., Akram, A., & Haddawy, P. (2017). Identifying important citations using contextual information from full text. Digital Libraries (JCDL), 2017 ACM/IEEE Joint Conference On, 1–8.

  • Herlach, G. (1976). Can retrieval of information from citation indexes be simplified? Multiple mention of a reference as a characteristic of the link between cited and citing article. Journal of the American Society for Information Science, 29(6), 308.

  • Hernández-Alvarez, M., & Gomez, J. M. (2016). Survey about citation context analysis: Tasks, techniques, and resources. Natural Language Engineering, 22(3), 327–349. https://doi.org/10.1017/S1351324915000388

  • Hirsch, J. E. (2005). An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences of the United States of America, 102(46), 16569–16572. https://doi.org/10.1073/pnas.0507655102

  • Hu, Z., Chen, C., & Liu, Z. (2013). Where are citations located in the body of scientific articles? A study of the distributions of citation locations. Journal of Informetrics, 7(4), 887–896. https://doi.org/10.1016/j.joi.2013.08.005

  • Jeong, Y. K., Song, M., & Ding, Y. (2014). Content-based author co-citation analysis. Journal of Informetrics, 8(1), 197–211. https://doi.org/10.1016/j.joi.2013.12.001

  • Li, X., He, Y., Meyers, A., & Grishman, R. (2013). Towards fine-grained citation function classification. Proceedings of Recent Advances in Natural Language Processing, 402–407.

  • Lithgow-Serrano, O., Gama-Castro, S., Ishida-Gutiérrez, C., Mejía-Almonte, C., Tierrafría, V. H., Martínez-Luna, S., Santos-Zavaleta, A., Velázquez-Ramírez, D., & Collado-Vides, J. (2019). Similarity corpus on microbial transcriptional regulation. Journal of Biomedical Semantics, 10(1), 8. https://doi.org/10.1186/s13326-019-0200-x

  • Maričić, S., Spaventi, J., Pavičić, L., & Pifat-Mrzljak, G. (1998). Citation context versus the frequency counts of citation histories. Journal of the American Society for Information Science, 49(6), 530–540. https://doi.org/10.1002/(SICI)1097-4571(19980501)49:6<530::AID-ASI5>3.0.CO;2-U

  • McCain, K. W. (2014). Assessing obliteration by incorporation in a full-text database: JSTOR, economics, and the concept of “bounded rationality.” Scientometrics, 101(2), 1445–1459. https://doi.org/10.1007/s11192-014-1237-3

  • McKeown, K., Daume, H., Chaturvedi, S., Paparrizos, J., Thadani, K., Barrio, P., Biran, O., Bothe, S., Collins, M., Fleischmann, K. R., Gravano, L., Jha, R., King, B., McInerney, K., Moon, T., Neelakantan, A., O’Seaghdha, D., Radev, D., Templeton, C., & Teufel, S. (2016). Predicting the impact of scientific concepts using full-text features. Journal of the Association for Information Science and Technology, 67(11), 2684–2696. https://doi.org/10.1002/asi.23612

  • Meng, R., Lu, W., Chui, Y., & Shuguang, H. (2017). Automatic classification of citation function by new linguistic features. iConference 2017 Proceedings, 826–830. https://doi.org/10.9776/17349

  • Merton, R. K. (1965). On the shoulders of giants: A Shandean postscript. The Free Press.

  • Merton, R. K. (1988). The Matthew effect in science, II: Cumulative advantage and the symbolism of intellectual property. Isis, 79(4), 606–623. https://doi.org/10.1086/354848

  • Pride, D., & Knoth, P. (2017). Incidental or influential?—A decade of using text-mining for citation function classification. 16th International Society of Scientometrics and Informetrics Conference, Wuhan, China.

  • Ritchie, A., Robertson, S., & Teufel, S. (2008). Comparing citation contexts for information retrieval. Proceedings of the 17th ACM Conference on Information and Knowledge Management - CIKM ’08, 213. https://doi.org/10.1145/1458082.1458113

  • Singha Roy, S., Mercer, R. E., & Urra, F. (2020). Investigating citation linkage as a sentence similarity measurement task using deep learning. In C. Goutte & X. Zhu (Eds.), Advances in artificial intelligence (pp. 483–495). Springer International Publishing. https://doi.org/10.1007/978-3-030-47358-7_50

  • Soğancıoğlu, G., Öztürk, H., & Özgür, A. (2017). BIOSSES: A semantic sentence similarity estimation system for the biomedical domain. Bioinformatics, 33(14), i49–i58. https://doi.org/10.1093/bioinformatics/btx238

  • Stremersch, S., Camacho, N., Vanneste, S., & Verniers, I. (2015). Unraveling scientific impact: Citation types in marketing journals. International Journal of Research in Marketing, 32(1), 64–77. https://doi.org/10.1016/j.ijresmar.2014.09.004

  • Strotmann, A., & Zhao, D. (2014). Uncertainty of author citation rankings: Lessons from in-text citation weighing schemes. Proceedings of the Association for Information Science and Technology, 51(1), 1–4.

  • Teufel, S., Siddharthan, A., & Tidhar, D. (2006). Automatic classification of citation function. Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing - EMNLP ’06, 103. https://doi.org/10.3115/1610075.1610091

  • Tuarob, S., Kang, S. W., Wettayakorn, P., Pornprasit, C., Sachati, T., Hassan, S.-U., & Haddawy, P. (2020). Automatic classification of algorithm citation functions in scientific literature. IEEE Transactions on Knowledge and Data Engineering, 32(10), 1881–1896. https://doi.org/10.1109/TKDE.2019.2913376

  • Valenzuela, M., Ha, V., & Etzioni, O. (2015). Identifying meaningful citations. AAAI Workshops, 21–26.

  • Voos, H., & Dagaev, K. (1976). Are all citations equal? Or, did we op. cit. your idem? The Journal of Academic Librarianship, 1(6), 19–21.

  • Wan, X., & Liu, F. (2014). Are all literature citations equally important? Automatic citation strength estimation and its applications. Journal of the Association for Information Science and Technology, 65(9), 1929–1938. https://doi.org/10.1002/asi.23083

  • Wang, Y., Afzal, N., Fu, S., Wang, L., Shen, F., Rastegar-Mojarad, M., & Liu, H. (2018). MedSTS: A resource for clinical semantic textual similarity. Language Resources and Evaluation. https://doi.org/10.1007/s10579-018-9431-1

  • Yang, X., He, X., Zhang, H., Ma, Y., Bian, J., & Wu, Y. (2020). Measurement of semantic textual similarity in clinical texts: Comparison of transformer-based models. JMIR Medical Informatics, 8(11), e19735. https://doi.org/10.2196/19735

  • Zhao, D., & Strotmann, A. (2014). In-text author citation analysis: Feasibility, benefits, and limitations. Journal of the Association for Information Science and Technology, 65(11), 2348–2358. https://doi.org/10.1002/asi.23107

  • Zhao, D., & Strotmann, A. (2016). Dimensions and uncertainties of author citation rankings: Lessons learned from frequency-weighted in-text citation counting. Journal of the Association for Information Science and Technology, 67(3), 671–682. https://doi.org/10.1002/asi.23418

  • Zhu, X., Turney, P., Lemire, D., & Vellino, A. (2015). Measuring academic influence: Not all citations are equal. Journal of the Association for Information Science and Technology, 66(2), 408–427. https://doi.org/10.1002/asi.23179

Acknowledgements

This study is a modified version of a preprint (Asubiaro & Ajiferuke, 2021) deposited in Research Square. This journal publication is part of the first author’s doctoral thesis and contains text copied verbatim from the thesis. The contributions of the members of the first author’s doctoral thesis committee, Professor Robert Mercer (Computer Science Department, University of Western Ontario, London, Canada) and Professor Victoria Rubin (Library and Information Science Program, University of Western Ontario, London, Canada), are acknowledged.

Funding

The first author received the Western Graduate Research Scholarships from September 2016 to September 2020, and the Ontario Graduate Scholarships and the Queen Elizabeth II Graduate Scholarships in Science and Technology (OGS/QEII-GSST) from the 2019 summer term to the 2020 winter term.

Author information

Corresponding author

Correspondence to Toluwase Victor Asubiaro.

Ethics declarations

Conflict of interest

The authors reported no potential competing interests.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Asubiaro, T.V., Ajiferuke, I. Semantic similarity-based credit attribution on citation paths: a method for allocating residual citation to and investigating depth of influence of scientific communications. Scientometrics 127, 6257–6277 (2022). https://doi.org/10.1007/s11192-022-04522-3

  • DOI: https://doi.org/10.1007/s11192-022-04522-3