Environmental Management, Volume 53, Issue 5, pp 883–893

Why is Data Sharing in Collaborative Natural Resource Efforts so Hard and What can We Do to Improve it?

Article

Abstract

Increasingly, research and management in natural resource science rely on very large datasets compiled from multiple sources. While more data are generally beneficial, using large, complex datasets has introduced challenges in data sharing, especially for collaborating researchers in disparate locations (“distributed research teams”). We surveyed natural resource scientists about common data-sharing problems. When providing data, the major issue identified by our survey respondents (n = 118) was a lack of clarity in the data request (including the format of the data requested). When receiving data, survey respondents reported various insufficiencies in the documentation describing the data (e.g., no data collection description or protocol, or data aggregated or summarized without explanation). Because inadequately documented metadata (“information about the data”) is a central obstacle to efficient data handling, we suggest that documenting metadata through data dictionaries, protocols, read-me files, explicit null value documentation, and process metadata is essential to any large-scale research program. We encourage all researchers, but especially those in distributed teams, to alleviate these problems with several readily available communication strategies, including organizational charts to define roles, data flow diagrams to outline procedures and timelines, and data update cycles to guide data-handling expectations. In particular, we argue that distributed research teams magnify data-sharing challenges, making data management training even more crucial for natural resource scientists. If natural resource scientists fail to overcome communication and metadata documentation issues, then negative data-sharing experiences will likely continue to undermine the success of many large-scale collaborative projects.
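As an illustration of what a data dictionary with explicit null value documentation might look like in practice, the following is a minimal Python sketch. It is not taken from the article: the field names (`site_id`, `water_temp`, `fish_count`), the null codes, and the helper functions `check_record` and `is_null` are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    """One data dictionary entry (all example values are hypothetical)."""
    name: str
    description: str
    unit: str
    null_code: Optional[str]  # None means the field may never be null
    null_meaning: str


# Hypothetical data dictionary for a small stream survey table.
DATA_DICTIONARY = [
    FieldSpec("site_id", "Unique survey site identifier", "none",
              None, "nulls not permitted"),
    FieldSpec("water_temp", "Stream temperature at time of survey", "degrees C",
              "-9999", "sensor failed; value not recorded"),
    FieldSpec("fish_count", "Number of fish observed in the reach", "count",
              "NA", "reach not surveyed (distinct from a count of 0)"),
]


def check_record(record, dictionary=DATA_DICTIONARY):
    """Return a list of problems: missing or undocumented fields."""
    problems = []
    known = {spec.name for spec in dictionary}
    for spec in dictionary:
        if spec.name not in record:
            problems.append(f"missing field: {spec.name}")
    for key in record:
        if key not in known:
            problems.append(f"undocumented field: {key}")
    return problems


def is_null(record, field_name, dictionary=DATA_DICTIONARY):
    """True if the value equals the documented null code for that field."""
    spec = next(s for s in dictionary if s.name == field_name)
    return spec.null_code is not None and record[field_name] == spec.null_code


if __name__ == "__main__":
    row = {"site_id": "A1", "water_temp": "-9999", "fish_count": "0"}
    print(check_record(row))           # problems found in the row, if any
    print(is_null(row, "water_temp"))  # documented null code for temperature
    print(is_null(row, "fish_count"))  # a count of 0 is a real observation
```

Recording the null code next to its meaning lets a data recipient distinguish "not measured" from a true zero without contacting the data provider, which is the kind of ambiguity the survey respondents described.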

Keywords

Natural resource management; Metadata; Data sharing; Data flow diagrams; Distributed teams; Data transfer

Notes

Acknowledgments

This work was supported by the Integrated Status and Effectiveness Monitoring Program (funded by Bonneville Power Administration, 2003-017-00), the National Research Council, and the Northwest Fisheries Science Center (NOAA-Fisheries). Chris Jordan, Steve Rentmeester, and Andy Albaugh provided valuable insight and experiences in the development of this manuscript.

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. South Fork Research, Inc., North Bend, USA
  2. Signal to Noise Consulting, Santa Monica, USA
  3. Northwest Fisheries Science Center, NOAA-Fisheries, Seattle, USA
