Who’s Got the Data? Interdependencies in Science and Technology Collaborations

Computer Supported Cooperative Work (CSCW)

Abstract

Science and technology always have been interdependent, but never more so than with today’s highly instrumented data collection practices. We report on a long-term study of collaboration between environmental scientists (biology, ecology, marine sciences), computer scientists, and engineering research teams as part of a five-university distributed science and technology research center devoted to embedded networked sensing. The science and technology teams go into the field with mutual interests in gathering scientific data. “Data” are constituted very differently between the research teams. What are data to the science teams may be context to the technology teams, and vice versa. Interdependencies between the teams determine the ability to collect, use, and manage data in both the short and long terms. Four types of data were identified, which are managed separately, limiting both reusability of data and replication of research. Decisions on what data to curate, for whom, for what purposes, and for how long, should consider the interdependencies between scientific and technical processes, the complexities of data collection, and the disposition of the resulting data.

Acknowledgements

Research reported here is supported in part by grants from the National Science Foundation (NSF): (1) The Center for Embedded Networked Sensing (CENS) is funded by NSF Cooperative Agreement #CCR-0120778, Deborah L. Estrin, UCLA, Principal Investigator; (2) CENS Education Infrastructure (CENSEI), under which much of this research was conducted, is funded by NSF grant #ESI-0352572, William A. Sandoval, Principal Investigator, and Christine L. Borgman, co-Principal Investigator; (3) Towards a Virtual Organization for Data Cyberinfrastructure, #OCI-0750529, C.L. Borgman, UCLA, PI; G. Bowker, Santa Clara University, Co-PI; Thomas Finholt, University of Michigan, Co-PI; (4) Monitoring, Modeling & Memory: Dynamics of Data and Knowledge in Scientific Cyberinfrastructures, #0827322, P.N. Edwards, UM, PI; Co-PIs C.L. Borgman, UCLA; G. Bowker, SCU; T. Finholt, UM; S. Jackson, UM; D. Ribes, Georgetown; S.L. Star, SCU.

We are also grateful to Microsoft Technical Computing and External Research for gifts in support of this research program. The authors also thank Archer Batcheller, David Fearon, George Mood, Alberto Pepe, Katie Shilton, Elizabeth Rolando, and Laura Wynholds for their thoughtful comments on prior drafts of this paper.

Author information

Corresponding author

Correspondence to Jillian C. Wallis.

About this article

Cite this article

Borgman, C.L., Wallis, J.C. & Mayernik, M.S. Who’s Got the Data? Interdependencies in Science and Technology Collaborations. Comput Supported Coop Work 21, 485–523 (2012). https://doi.org/10.1007/s10606-012-9169-z

  • DOI: https://doi.org/10.1007/s10606-012-9169-z
