Academic publishers claim that they add value to scholarly communications by coordinating reviews and by contributing to and enhancing the text during publication. These contributions come at a considerable cost: US academic libraries paid $1.7 billion for serial subscriptions in 2008 alone. Library budgets, in contrast, are flat and cannot keep pace with serial price inflation. We investigated the publishers' value proposition by conducting a comparative study of pre-print papers from two distinct science, technology, and medicine corpora and their final published counterparts. The comparison rested on two working assumptions: (1) if the publishers' argument is valid, the text of a pre-print paper should vary measurably from its corresponding final published version, and (2) by applying standard similarity measures, we should be able to detect and quantify such differences. Our analysis revealed that the text of the scientific papers generally changed very little between the pre-print and the final published version. These findings contribute empirical indicators to discussions of the added value of commercial publishers and should therefore inform libraries' economic decisions regarding access to scholarly publications.
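To make the second working assumption concrete, the following is a minimal sketch of how a pre-print and its published version might be compared with a few standard similarity measures. The abstract does not say which measures or preprocessing steps the study used, so the choice here (Jaccard and Sørensen-Dice over word sets, plus a character-level Levenshtein distance) and the simple lowercase tokenizer are assumptions made for illustration only. All function names are hypothetical.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric word tokens (assumed preprocessing)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of the two texts' token sets: |X & Y| / |X | Y|."""
    x, y = tokens(a), tokens(b)
    if not x and not y:
        return 1.0  # two empty texts are treated as identical
    return len(x & y) / len(x | y)


def sorensen_dice(a, b):
    """Sorensen-Dice similarity of the token sets: 2|X & Y| / (|X| + |Y|)."""
    x, y = tokens(a), tokens(b)
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


def levenshtein(a, b):
    """Character-level edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def levenshtein_similarity(a, b):
    """Edit distance normalized to [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    # Illustrative strings only, not data from the study.
    preprint = "We measure the similarity of preprint and published text."
    published = "We measure the similarity of pre-print and published texts."
    print(jaccard(preprint, published))
    print(sorensen_dice(preprint, published))
    print(levenshtein_similarity(preprint, published))
```

Under these assumptions, scores close to 1.0 for a pre-print and its published version would indicate that little text changed, which is the kind of difference the comparison sets out to quantify. The pure-Python edit distance takes quadratic time, so a full-text comparison at corpus scale would likely need an optimized library.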
Cite this article
Klein, M., Broadwell, P., Farb, S.E. et al. Comparing published scientific journal articles to their pre-print versions. Int J Digit Libr 20, 335–350 (2019). https://doi.org/10.1007/s00799-018-0234-1
- Open access
- Scholarly publishing
- Text similarity