Detecting and analyzing missing citations to published scientific entities

Published in Scientometrics

Abstract

Proper citation is of great importance in academic writing because it enables knowledge accumulation and maintains academic integrity. However, citing properly is not an easy task. For published scientific entities, the ever-growing volume of academic publications and over-familiarity with terms easily lead to missing citations. To address this, we design a method, Citation Recommendation for Published Scientific Entity (CRPSE), based on the co-occurrences between published scientific entities and in-text citations within the same sentences of earlier papers. Experimental results show that our method is effective in recommending the source papers for published scientific entities. We further conduct a statistical analysis of missing citations among papers published in prestigious computer science conferences in 2020. In the 12,278 papers collected, 475 published scientific entities in computer science and mathematics are found to have missing citations. Many entities mentioned without citations are well-accepted research results. The papers that proposed these entities were published a median of 8 years earlier, which can be considered the time frame for a published scientific entity to develop into a well-accepted concept. For published scientific entities, we appeal for accurate and full citation of their source papers, as required by academic standards.
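The co-occurrence idea behind the abstract can be illustrated with a minimal Python sketch. This is not the authors' CRPSE implementation: the function names, the `min_support` threshold, the toy sentences and the paper identifiers are all assumptions made for illustration. The sketch only counts how often an entity appears in the same sentence as a citation and suggests the most frequently co-cited paper.

```python
from collections import Counter, defaultdict


def build_cooccurrence(sentences):
    """Count entity/citation co-occurrences within sentences.

    `sentences` is an iterable of (entity_mentions, cited_paper_ids) pairs,
    one pair per sentence (a hypothetical, pre-parsed input format).
    """
    counts = defaultdict(Counter)
    for entities, citations in sentences:
        # Use sets so a repeated mention in one sentence is counted once.
        for entity in set(entities):
            for paper_id in set(citations):
                counts[entity][paper_id] += 1
    return counts


def recommend_source(counts, entity, min_support=2):
    """Return the paper most often cited alongside `entity`, or None.

    `min_support` is an assumed threshold, not a value from the paper.
    """
    paper_counts = counts.get(entity)
    if not paper_counts:
        return None
    paper_id, support = paper_counts.most_common(1)[0]
    return paper_id if support >= min_support else None


# Toy data with made-up paper identifiers.
toy_sentences = [
    (["Adam"], ["paper_A"]),
    (["Adam", "BERT"], ["paper_A", "paper_B"]),
    (["BERT"], ["paper_B"]),
    (["BERT"], ["paper_B"]),
    (["Adam"], []),  # a mention with a missing citation
]
toy_counts = build_cooccurrence(toy_sentences)
adam_source = recommend_source(toy_counts, "Adam")
bert_source = recommend_source(toy_counts, "BERT")
unknown_source = recommend_source(toy_counts, "Dropout")
```

In this toy data, the last sentence mentions "Adam" without a citation, and the counts gathered from the other sentences point to `paper_A` as the likely source. A real system would also need entity recognition and citation parsing, which are outside this sketch.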

Notes

  1. https://app.dimensions.ai/discover/publication.

  2. Data was obtained in November of 2021.

  3. https://www.tensorflow.org/.

  4. https://github.com/tensorflow.

  5. The examples in the figure are created for better illustration, not real examples from S2ORC.

  6. https://github.com/explosion/spaCy.

  7. https://www.kaggle.com/rtatman/english-word-frequency.

  8. https://api.semanticscholar.org/.

  9. Papers with parsing errors are excluded.

Acknowledgements

We would like to acknowledge the support of Yingmin Wang in improving the mathematical expressions. We are grateful to Li Lei, Xun Zhou, Lei Lin and Meizhen Zheng for their help with the data processing. We also thank the two anonymous reviewers for their valuable comments. Special and heartfelt gratitude goes to the first author's wife, Fenmei Zhou, for her understanding and love. Her unwavering support and continuous encouragement made this research possible.

Funding

This work was partly funded by the 13th Five-Year Plan project "Artificial Intelligence and Language" of the State Language Commission of China (Grant No. WT135-38).

Author information

Corresponding author

Correspondence to Xiaodong Shi.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

About this article

Cite this article

Lin, J., Yu, Y., Song, J. et al. Detecting and analyzing missing citations to published scientific entities. Scientometrics 127, 2395–2412 (2022). https://doi.org/10.1007/s11192-022-04334-5

  • DOI: https://doi.org/10.1007/s11192-022-04334-5
