Abstract
Plagiarism is the subject of one of the most important current debates among scientific stakeholders. A separate but related issue is authors’ reuse of their own ideas across different papers (i.e., self-plagiarism). Opinions on this issue are mixed, and there is a lack of consensus. Our goal was to gain deeper insight into plagiarism and self-plagiarism through a citation analysis of the documents involved. The Déjà vu database, which comprises around 80,000 duplicate records, was used to select 247 pairs of documents that curators had examined in full text following a stringent protocol. We then used the Scopus database to perform a citation analysis of the selected documents. For each document pair, we used specific bibliometric indicators, such as the number of authors, full-text similarity, journal impact factor, the Eigenfactor, and article influence. Our results confirm that cases of plagiarism are published in journals with lower visibility and thus tend to receive fewer citations. Moreover, full-text similarity was significantly higher in cases of plagiarism than in cases of self-plagiarism. Among pairs of documents with shared authors, duplicates that did not cite the original document showed higher full-text similarity than those that did, and also showed greater overlap between the reference lists of the two documents.
Acknowledgments
We are grateful to Grant Lewison, Juan Gorraiz, Juan M. García Ruiz, and Javier Ruiz-Castillo for their comments on an earlier version of the manuscript, although any errors are our own and should not tarnish the reputations of these esteemed persons. We also wish to thank the two “anonymous” reviewers for their comments and suggestions.
Cite this article
García-Romero, A., Estrada-Lorenzo, J.M. A bibliometric analysis of plagiarism and self-plagiarism through Déjà vu. Scientometrics 101, 381–396 (2014). https://doi.org/10.1007/s11192-014-1387-3
DOI: https://doi.org/10.1007/s11192-014-1387-3