An examination of research data sharing and re-use: implications for data citation practice

Abstract

This study examines the characteristics of data sharing and data re-use in Genetics and Heredity, the field where data citation is most common. It takes an exploratory approach because data citation is a relatively new area. The Data Citation Index (DCI) on the Web of Science was selected because it provides a single access point to over 500 data repositories worldwide and to over two million data studies and datasets across multiple disciplines, and because it monitors quality research data through a peer review process. We explore data citations in Genetics and Heredity as a case study, both formally, by examining citations recorded in the DCI, and informally, by sampling a selection of papers for implicit data citations within publications. Citer-based analysis is conducted to remedy self-citation in the data citation phenomenon. We examine 148 sampled citing articles to identify factors that influence data sharing and data re-use, looking at their references, main text, supplementary data/information, acknowledgments, funding information, author information, and web/author resources. This study is unique in that it relies on a citer-based analysis approach and analyzes peer-reviewed and published data, data repositories, and citing articles of highly productive authors where data sharing is most prevalent. The research is intended to make a methodological and practical contribution to the study of data citation.

Author information

Corresponding author

Correspondence to Hyoungjoo Park.

About this article

Cite this article

Park, H., Wolfram, D. An examination of research data sharing and re-use: implications for data citation practice. Scientometrics 111, 443–461 (2017). https://doi.org/10.1007/s11192-017-2240-2

Keywords

  • Citation analysis
  • Data citation
  • Data sharing
  • Data re-use
  • Citer-based analysis
  • Research data

MSC

  • 62-07 data analysis

JEL

  • C02 mathematical methods