Navigating Oceans of Data

  • David Maier
  • V. M. Megler
  • António M. Baptista
  • Alex Jaramillo
  • Charles Seaton
  • Paul J. Turner
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7338)

Abstract

Some science domains have the advantage that the bulk of the data comes from a single source instrument, such as a telescope or particle collider. More commonly, big data implies a big variety of data sources. For example, the Center for Coastal Margin Observation and Prediction (CMOP) has multiple kinds of sensors (salinity, temperature, pH, dissolved oxygen, chlorophyll A & B) on diverse platforms (fixed station, buoy, ship, underwater robot), with data coming in at different rates over various spatial scales and provided at several quality levels (raw, preliminary, curated). In addition, there are physical samples analyzed in the lab for biochemical and genetic properties, and simulation models for estuaries and near-ocean fluid dynamics and biogeochemical processes. Few people know the entire range of data holdings, much less their structures and how to access them. We present a variety of approaches CMOP has followed to help operational, science and resource managers locate, view and analyze data, including the Data Explorer, Data Near Here, and topical “watch pages.” From these examples, and user experiences with them, we draw lessons about supporting users of collaborative “science observatories” and about the challenges that remain.

Keywords

environmental data; spatial-temporal data management; ocean observatories

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • David Maier (1)
  • V. M. Megler (1)
  • António M. Baptista (2)
  • Alex Jaramillo (2)
  • Charles Seaton (2)
  • Paul J. Turner (2)
  1. Computer Science Department, Portland State University, USA
  2. Center for Coastal Margin Observation & Prediction, Oregon Health & Science University, USA
