Little science confronts the data deluge: habitat ecology, embedded sensor networks, and digital libraries

  • Christine L. Borgman
  • Jillian C. Wallis
  • Noel Enyedy
REGULAR PAPER

Abstract

e-Science promises to increase the pace of science via fast, distributed access to computational resources, analytical tools, and digital libraries. “Big science” fields such as physics and astronomy that collaborate around expensive instrumentation have constructed shared digital libraries to manage their data and documents, while “little science” research areas that gather data through hand-crafted fieldwork continue to manage their data locally. As habitat ecology researchers begin to deploy embedded sensor networks, they are confronting an array of challenges in capturing, organizing, and managing large amounts of data. The scientists and their partners in computer science and engineering make use of common datasets but interpret the data differently. Studies of this field in transition offer insights into the role of digital libraries in e-Science, how data practices evolve as science becomes more instrumented, and how scientists, computer scientists, and engineers collaborate around data. Among the lessons learned are that data on the same variables are gathered by multiple means, that data exist in many states and in many places, and that publication practices often drive data collection practices. Data sharing is embraced in principle but little sharing actually occurs, due to interrelated factors such as lack of demand, lack of standards, and concerns about publication, ownership, data quality, and ethics. We explore the implications of these findings for data policy and digital library architecture. Research reported here is affiliated with the Center for Embedded Networked Sensing.

Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  • Christine L. Borgman (1)
  • Jillian C. Wallis (2)
  • Noel Enyedy (3)
  1. Department of Information Studies, Graduate School of Education & Information Studies, UCLA, Los Angeles, USA
  2. Center for Embedded Networked Sensing, UCLA, Los Angeles, USA
  3. Department of Education, Graduate School of Education & Information Studies, UCLA, Los Angeles, USA
