Not by metadata alone: the use of diverse forms of knowledge to locate data for reuse

REGULAR PAPER

Abstract

An important set of challenges for eScience initiatives and digital libraries concerns the need to provide scientists with the ability to access data from multiple sources. This paper argues that an analysis of scientists' reuse of data prior to the advent of eScience can illuminate the requirements and design of digital libraries and cyberinfrastructure. As part of a larger study on data sharing and reuse, I investigated the processes by which ecologists locate data that were initially collected by others. Ecological data are unusually complex and present daunting problems of interpretation and analysis that must be considered in the design of cyberinfrastructure. The ecologists whom I interviewed found ways to overcome many of these difficulties. One part of my results shows that ecologists use formal and informal knowledge, gained through disciplinary training and through their own data-gathering experiences, to overcome hurdles in finding, acquiring, and validating data collected by others. A second part of my findings reveals that ecologists rely on formal notions of scientific practice that emphasize objectivity to justify the methods they use to collect data for reuse. I discuss the implications of these findings for digital libraries and eScience initiatives.

Keywords

Data reuse, Data sharing, Ecology


Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  1. Collaboratory for Research on Electronic Work, School of Information, University of Michigan, Ann Arbor, USA
