Dead Science: Most Resources Linked in Biomedical Articles Disappear in Eight Years
Scientific progress critically depends on disseminating analytic pipelines and datasets that make results reproducible and replicable. Increasingly, researchers make resources available for wider reuse and embed links to them in their published manuscripts. Previous research has shown that these resources become unavailable over time, but the extent and causes of this problem in open access publications have not been well explored. Using 1.9 million articles from PubMed Open Access, we estimate that half of all resources become unavailable after 8 years. We find that the number of times a resource has been used, the international (int) and organization (org) domain suffixes, and the number of affiliations are positively related to resources being available. In contrast, we find that the length of the URL, the Indian (in), European Union (eu), and Chinese (cn) domain suffixes, and abstract length are negatively related to resources being available. Our results contribute to the understanding of resource sharing in science and provide some guidance for addressing resource decay.
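To make two of the URL-level factors named above concrete (URL length and domain suffix), here is a minimal Python sketch of how such descriptors might be derived from a link. It is an illustration only and not the authors' pipeline: the function name `url_features`, the dictionary keys, and the example URL are hypothetical, and the suffix is assumed to be simply the last dot-separated label of the host name.

```python
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Return two simple descriptors of a URL: its length and its domain suffix.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # urlparse(...).hostname is lowercased and excludes any port or credentials
    host = urlparse(url).hostname or ""
    # Assumption: the suffix is the last dot-separated label of the host name.
    # Bare IP addresses or single-label hosts are not handled specially here.
    suffix = host.rsplit(".", 1)[-1] if "." in host else ""
    return {"url_length": len(url), "suffix": suffix}


# Hypothetical example link, not drawn from the study's data
features = url_features("https://example.org/data/v1")
```

Descriptors like these could then serve as inputs to whatever availability model one chooses; the abstract does not specify the model, so none is assumed here.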
Tong Zeng was funded by the China Scholarship Council #201706190067. Daniel E. Acuna was funded by the National Science Foundation awards #1646763 and #1800956.