
A bibliometric analysis of plagiarism and self-plagiarism through Déjà vu


Plagiarism is one of the most important current debates among scientific stakeholders. A separate but related issue is the use of authors’ own ideas in different papers (i.e., self-plagiarism). Opinions on this issue are mixed, and there is a lack of consensus. Our goal was to gain deeper insight into plagiarism and self-plagiarism through a citation analysis of documents involved in these situations. The Déjà vu database, which comprises around 80,000 duplicate records, was used to select 247 pairs of documents that had been examined by curators on a full text basis following a stringent protocol. We then used the Scopus database to perform a citation analysis of the selected documents. For each document pair, we used specific bibliometric indicators, such as the number of authors, full text similarity, journal impact factor, the Eigenfactor, and article influence. Our results confirm that cases of plagiarism are published in journals with lower visibility and thus tend to receive fewer citations. Moreover, full text similarity was significantly higher in cases of plagiarism than in cases of self-plagiarism. Among pairs of documents with shared authors, duplicates not citing the original document showed higher full text similarity than those citing the original document, and also showed greater overlap in the references cited in the two documents.
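As a rough illustration of the last indicator mentioned above, the overlap between the reference lists of two documents can be expressed as a set similarity. The sketch below uses the Jaccard index, which is an assumption made here for illustration; the abstract does not state which overlap measure the authors applied. The function name `reference_overlap` and the normalization step (trimming and lowercasing reference strings) are also hypothetical.

```python
def reference_overlap(refs_a, refs_b):
    """Jaccard overlap between two reference lists.

    Assumption: each reference is a plain string, and two references match
    when they are equal after trimming whitespace and lowercasing. Real
    reference matching would need more careful normalization.
    """
    set_a = {ref.strip().lower() for ref in refs_a}
    set_b = {ref.strip().lower() for ref in refs_b}
    union = set_a | set_b
    if not union:
        # Two empty reference lists: define the overlap as zero.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical document pair sharing two of four distinct references.
original = ["Errami 2008", "Fanelli 2009", "Reich 2010"]
duplicate = ["fanelli 2009", "Reich 2010 ", "Chalmers 2009"]
print(reference_overlap(original, duplicate))
```

A value near 1 would indicate that the two documents cite almost the same sources, while a value near 0 would indicate largely independent reference lists.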






We are grateful to Grant Lewison, Juan Gorraiz, Juan M. García Ruiz, and Javier Ruiz-Castillo for their comments on an earlier version of the manuscript, although any errors are our own and should not tarnish the reputations of these esteemed persons. We also wish to thank the two “anonymous” reviewers for their comments and suggestions.

Author information



Corresponding author

Correspondence to Antonio García-Romero.


About this article


Cite this article

García-Romero, A., Estrada-Lorenzo, J.M. A bibliometric analysis of plagiarism and self-plagiarism through Déjà vu. Scientometrics 101, 381–396 (2014).



  • Plagiarism
  • Duplicate publications
  • Déjà vu
  • Citation analysis