Scientific Databases for Environmental Research

Ecological Informatics

Abstract

Databases are an important tool for environmental researchers. A rich variety of database types is available to researchers for managing their own data and for sharing data with others. However, using databases for research is not without challenges, because scientific data differ from the data in many business applications in longevity, volume, diversity, and the ways they are used. This chapter reviews some successful scientific databases, pathways for developing scientific data resources, and general classes of Database Management Systems (DBMS). It also introduces data modeling, normalization, and how databases and the data derived from them can be interlinked to produce new scientific products.

Acknowledgments

This chapter benefited immensely from conversations with information managers and scientists in the LTER network and DataONE, and with Robert Robbins, William K. Michener, Susan Stafford, Bruce P. Hayden, Dick Olson and John Pfaltz. It was supported by NSF grant DEB-1237733 to the University of Virginia.

Author information

Correspondence to John H. Porter.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Porter, J.H. (2018). Scientific Databases for Environmental Research. In: Recknagel, F., Michener, W. (eds) Ecological Informatics. Springer, Cham. https://doi.org/10.1007/978-3-319-59928-1_3
