Abstract
Databases are an important tool for environmental researchers. A rich variety of database types is available to researchers for managing their own data and for sharing data with others. However, using databases for research poses challenges, because scientific data differ from the data in many business applications in longevity, volume, diversity, and the ways they are used. This chapter reviews some successful scientific databases, pathways for developing scientific data resources, and general classes of database management systems (DBMS). It also introduces data modeling and normalization, and describes how databases and the data derived from them can be interlinked to produce new scientific products.
Acknowledgments
This chapter benefited immensely from conversations with information managers and scientists in the LTER network and DataONE, Robert Robbins, William K. Michener, Susan Stafford, Bruce P. Hayden, Dick Olson and John Pfaltz. It was supported by NSF grant DEB-1237733 to the University of Virginia.
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Porter, J.H. (2018). Scientific Databases for Environmental Research. In: Recknagel, F., Michener, W. (eds) Ecological Informatics. Springer, Cham. https://doi.org/10.1007/978-3-319-59928-1_3
DOI: https://doi.org/10.1007/978-3-319-59928-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59926-7
Online ISBN: 978-3-319-59928-1
eBook Packages: Earth and Environmental Science (R0)