Abstract
By broad consensus, Open Data presents great value. However, beyond that simple statement, there are a number of complex, and sometimes contentious, issues that the science community must address. In this review, we examine the current state of the core issues of Open Data with the unique perspective and use cases of the ocean science community: interoperability; discovery and access; quality and fitness for purpose; and sustainability. The topics of Governance and Data Publication are also examined in detail. Each of the areas covered is, by itself, complex, and the approaches to the issues under consideration are often at odds with each other. Any comprehensive policy on Open Data will require compromises that are best resolved by broad community input. In the final section of the review, we provide recommendations that serve as a starting point for these discussions.
Notes
For example, file synchronization protocols such as rsync could be used.
References
Allcock W, Bresnahan J, Kettimuthu R, Link M, Dumitrescu C, Raicu J, Foster I (2005) The Globus striped GridFTP framework and server. In: Proceedings of the 2005 ACM/IEEE conference on Supercomputing, p 54. IEEE Computer Society
Allinson J (2006) OAIS as a reference model for repositories: an evaluation. Report, UKOLN, University of Bath. http://eprints.whiterose.ac.uk/3464/. Accessed 19 Sept 2014
Altman M, King G (2007) A proposed standard for the scholarly citation of quantitative data. D-Lib Magazine, 13(3/4). http://www.dlib.org/dlib/march07/altman/03altman.html. Accessed 19 Sept 2014
Australian Government (2009) Government 2.0 task force report. http://www.finance.gov.au/publications/gov20taskforcereport/doc/Government20TaskforceReport.pdf. Accessed 16 Sept 2014
Ball A, Duke M (2012) How to cite data sets and link to publications. Edinburgh, UK: Digital Curation Centre. http://www.dcc.ac.uk/webfm_send/525. Accessed 16 Sept 2014
BCO-DMO (2014) Biological & chemical oceanography data management office. http://www.bco-dmo.org. Accessed 16 Sept 2014
BDJ (2014) Biodiversity data journal. http://biodiversitydatajournal.com. Accessed 16 Sept 2014
Berners-Lee T (2009) The next web. TED 2009 Conference. http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html. Accessed 16 Sept 2014
Best B, Halpin P, Fujioka E, Read A, Qian S, Hazen L, Schick R (2007) Geospatial web services within a scientific workflow: predicting marine mammal habitats in a dynamic environment. Ecol Inform 2(3):210–223. doi:10.1016/j.ecoinf.2007.07.007
Bjork B, Solomon D (2012) Pricing principles used by scholarly open access publishers. Learn Publ 25(3):132–137. doi:10.1087/20120207
BODC (2014) Published data library. https://www.bodc.ac.uk/data/published_data_library. Accessed 16 Sept 2014
Borgman CL (2012) The conundrum of sharing research data. J Assoc Inf Sci Technol 63(6):1059–1078. doi:10.1002/asi.22634
Braunschweig K, Eberius J, Thiele M, Lehner W (2012) The state of open data—limits of current open data platforms. In: Mille A, Gandon FL, Misselis J, Rabinovich M, Staab S (eds) Proceedings of the 21st World Wide Web Conference 2012, (WWW 2012), Lyon, France
Busse S, Kutsche RD, Leser U, Weber H (1999) Federated information systems: concepts, terminology and architectures. Tech. Rep., Technical University Berlin
Carpenter S et al (2009) Accelerate synthesis in ecology and environmental sciences. Bioscience 59(8):699–701. doi:10.1525/bio.2009.59.8.11
Cocco M (2012) Research infrastructure and e-science for data and observatories on earthquakes, volcanoes, surface dynamics and tectonics. ICRI2012, International conference on research infrastructures. http://ec.europa.eu/research/infrastructures/pdf/workshop_october_2011/16_esfri_epos_cocco.pdf. Accessed 16 Sept 2014
CODATA (2009) Data Sci J. http://www.codata.org/dsj. Accessed 16 Sept 2014
Copernicus (2014) Copernicus: The European earth observation programme. http://www.copernicus.eu. Accessed 16 Sept 2014
Copernicus Publications (2014) Earth system science data: the data publishing journal. http://earth-system-science-data.net. Accessed 16 Sept 2014
Costello M (2009) Motivating online publication of data. Bioscience 59(5):418–427. doi:10.1525/bio.2009.59.5.9
Costello M, Wieczorek J (2013) Biological Conservation 173:68–73. doi:10.1016/j.biocon.2013.10.018
Costello M, Bouchet P, Boxshall G, Fauchald K, Gordon D et al (2013a) Global coordination and standardisation in marine biodiversity through the world register of marine species (WoRMS) and related databases. PLoS One 8(1):e51629. doi:10.1371/journal.pone.0051629
Costello M, Michener WK, Gahegan M, Zhang Z-Q, Bourne PE (2013b) Biodiversity data should be published, cited, and peer reviewed. Trends Ecol Evol 28(8):454–461
Costello M, Appeltans W, Bailly N, Berendsohn W, Jong Y, Edwards M, Froese R, Huettmann F, Los W, Mees J, Segers H, Bisby F (2014) Strategies for the sustainability of online open-access biodiversity databases. Biol Conserv 173:155–165
Cragin MH, Palmer CL, Carlson JR, Witt M (2010) Data sharing, small science and institutional repositories. Philos Trans R Soc A 368:4023–4038
Creative Commons (2014) Creative commons license. http://creativecommons.org/licenses. Accessed 16 Sept 2014
CUAHSI (2013) CUAHSI Water Data Center. http://wdc.cuahsi.org. Accessed 16 Sept 2014
Datacite (2014) Datacite. https://www.datacite.org. Accessed 16 Sept 2014
DataNet (2014) DataNet Federation Consortium—collaboration environments for data driven science. http://datafed.org. Accessed 16 Sept 2014
DataONE (2014) NSF data observation network for earth (DataONE). https://www.dataone.org/about. Accessed 16 Sept 2014
DOI (2014) Digital object identifier. http://www.doi.org. Accessed 16 Sept 2014
Dryad (2014) Dryad digital repository. http://datadryad.org. Accessed 16 Sept 2014
DuraSpace (2014) DSpace. http://www.dspace.org. Accessed 16 Sept 2014
Dusterhus A, Hense A (2014) Automated quality evaluation for a more effective data peer review. Data Sci J 13:67–78
Earth Observations (2013) GEO BON—biodiversity observation network. http://www.earthobservations.org/geobon.shtml. Accessed 16 Sept 2014
Environmental Systems Research Institute (2014) Living Atlas of the World. http://doc.arcgis.com/en/living-atlas. Accessed 21 Sept 2014
ESIP (2012) Federation of earth science information partners. http://wiki.esipfed.org/index.php/Interagency_Data_Stewardship/Citations. Accessed 16 Sept 2014
ESS (2014) Earth and Space Science. http://agupubs.onlinelibrary.wiley.com/agu/journal/10.1002/%28ISSN%292333-5084/. Accessed 17 Sept 2014
EU (2006) Communication from the Commission to the Council and the European Parliament: Interoperability for pan-European government services. http://www.epsos.eu/uploads/tx_epsosfileshare Communication-on-Interoperability_01.pdf. Accessed 16 Sept 2014
EU (2013) Guidelines on open access to scientific publications and research data in Horizon 2020. Version one. http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf. Accessed 16 Sept 2014
European Commission (2011) Digital agenda: turning government data into gold. Press release. http://europa.eu/rapid/press-release_IP-11-1524_en.htm. Accessed 16 Sept 2014
F1000Research (2014) http://f1000research.com. Accessed 16 Sept 2014
FGDC (2014) National spatial data infrastructure (NSDI). http://www.fgdc.gov/nsdi/nsdi.html. Accessed 16 Sept 2014
Figshare (2014) Figshare. http://figshare.com. Accessed 16 Sept 2014
Folkman M, Liao L, Jarecke P (2001) EO-1/Hyperion hyperspectral imager design, development, characterization, and calibration, Proc. SPIE 4151, Hyperspectral remote sensing of the land and atmosphere, 40 (February 8, 2001); doi:10.1117/12.417022
Force II (2013) Force II: The future of research communication and scholarship, joint declaration of data citation principles. http://www.force11.org/datacitation. Accessed 16 Sept 2014
FSF (2014) Free software foundation. http://www.fsf.org. Accessed 16 Sept 2014
Gallagher J, Potter N, Sgouros T, Hankin S, Flierl G (2007) The data access protocol—DAP 2.0. NASA ESE-RFC-004.1.1. https://earthdata.nasa.gov/our-community/esdswg/standards-process-spg/rfc/esds-rfc-004-dap-20. Accessed 16 Sept 2014
GBIF (2014) Global biodiversity information facility. http://www.gbif.org. Accessed 16 Sept 2014
GDC (2014) Geological Data Center, Scripps Institution of Oceanography. http://gdc.ucsd.edu/. Accessed 17 Sept 2014
GEO/CEOS (2008) GEO/CEOS workshop on quality assurance of calibration & validation processes: Establishing an operational framework. http://qa4eo.org/workshop_washington08.html. Accessed 16 Sept 2014
GeoViQua (2007) GeoViQua: QUAlity aware VIsualization for the global earth observation system of systems. http://www.geoviqua.org. Accessed 16 Sept 2014
Grassle JF (2000) The Ocean Biogeographic Information System (OBIS): an on-line, worldwide atlas for accessing, modeling and mapping marine biological data in a multidimensional geographic context. Oceanography 13(3):5–7. doi:10.5670/oceanog.2000.01
Guess A (2013) Japan embraces open data, launches multiple open projects. http://semanticweb.com/japan-embraces-open-data-launches-multiple-open-projects_b35158. Accessed 16 Sept 2014
Hankin S, Blower J, Carval Th, Casey K, Donlon C, Lauret O, Loubrieu T, Srinivasan A, Trinanes J, Godoy O, Mendelssohn R, Signell R, De La Beaujardiere J, Cornillon P, Blanc F, Rew R, Harlan J (2010) NETCDF-CF-OPENDAP: standards for ocean data interoperability and object lessons for community data standards processes. OceanObs 2009, Venice Convention Centre, 21–25 September 2009, Venice, publication date 2010-12-23. http://archimer.ifremer.fr/doc/00027/13832/10969.pdf. Accessed 16 Sept 2014
Harley D, Kryzys Acord S, Earl-Novell S, Lawrence S, Judson King C (2010) Assessing the future landscape of scholarly communication: An exploration of faculty values and needs in seven disciplines. Center for Studies in Higher Education, UC Berkeley. http://escholarship.org/uc/cshe_fsc. Accessed 17 Sept 2014
IEDA (2014) Integrated earth data applications. http://www.iedadata.org. Accessed 16 Sept 2014
INSPIRE (2014) Infrastructure for spatial information in the European Community (INSPIRE). http://inspire.ec.europa.eu. Accessed 16 Sept 2014
IODE (2014) International oceanographic data and information exchange. http://www.iode.org. Accessed 16 Sept 2014
IRIS (2014) Incorporated research institutes for seismology. http://www.iris.edu/hq/. Accessed 23 Dec 2014
JoRD (2013) JoRD: Journal research data policy bank project. http://jordproject.wordpress.com. Accessed 16 Sept 2014
Kozak M, Hartley J (2013) Publication fees for open access journals: Different disciplines—different methods. J Am Soc Inf Sci Technol 64 (12). doi:10.1002/asi.22972
Kratz J (2014) Fifteen ideas about data validation (and peer review). Data Pub [Blog], http://datapub.cdlib.org/2014/05/08/fifteen-ideas-about-data-validation-and-peer-review. Accessed 16 Sept 2014
Laakso M, Welling P, Bukvova H, Nyman L, Björk B-C et al (2011) The development of open access journal publishing from 1993 to 2009. PLoS One 6(6):e20961. doi:10.1371/journal.pone.0020961
Lavoie B (2008) The open archival information system reference model: introductory guide. Microform Imaging Rev 33(2):68–81. doi:10.1515/MFIR.2004.68
Lawrence B, Jones C, Matthews B, Pepler S, Callaghan S (2011) Citation and peer review of data: moving towards formal data publication. Int J Digit Curation 6(12):4–37
Leadbetter A, Raymond L, Chandler C, Pikula L, Pissierssens P, Urban E (2013) Ocean Data Publication Cookbook. Paris: UNESCO, 39pp. (Intergovernmental Oceanographic Commission Manuals and Guides 64). http://www.iode.org/index.php?option=com_oe&task=viewDocumentRecord&docID=10574. Accessed 9 Mar 2014
Lecomte P, Stensaas G (2009) Overview of progress towards a data quality assurance strategy to facilitate interoperability. http://www.earthobservations.org/documents/committees/adc/200909_11thADC/DA-09-01a%20QA4EO.pdf. Accessed 22 Apr 2014
MBL WHOI Library (2014) http://www.mblwhoilibrary.org. Accessed 15 Sept 2014
MANTRA (2014) Research data MANTRA. http://datalib.edina.ac.uk/mantra. Accessed 15 Sept 2014
Marshall P, Tufo H, Keahey K, La Bissoniere D (2012) Architecting a large-scale elastic environment-recontextualization and adaptive cloud services for scientific computing. In Proceedings of ICSOFT:409–418
Mendeley (2014) Mendeley. http://www.mendeley.com. Accessed 15 Sept 2014
Mooney H, Newton MP (2012) The anatomy of a data citation: discovery, reuse and credit. J Librariansh Sch Commun 1(1):eP1035
NASA (2014) EOSDIS: NASA’s earth observing system data and information system. https://earthdata.nasa.gov. Accessed 15 Sept 2014
National Research Council (2012) For attribution – developing data attribution and citation practices and standards. National Academies Press, Washington, DC
Nativi S, Craglia M, Pearlman J (2012) The brokering approach for multidisciplinary interoperability: a position paper. Int J Spat Data Infrastruct 7:1–15
Nativi S, Craglia M, Pearlman J (2013) Earth science infrastructures interoperability: the brokering approach. J Sel Top Appl Earth Obs Remote Sens 6:1118–1129. doi:10.1109/JSTARS.2013.2243113
Nature Publishing Group (2014) Scientific data. http://www.nature.com/sdata. Accessed 15 Sept 2014
Nielsen M (2011) Reinventing discovery: the new era of networked science. Princeton University Press
NERC (2014) Data centres. http://www.nerc.ac.uk/research/sites/data. Accessed 15 Sept 2014
NOAA IOOS (2014) Quality assurance of real time ocean data, QARTOD. http://www.ioos.noaa.gov/qartod/welcome.html. Accessed 15 Sept 2014
NSB (2011) NSB 11–79 digital research data sharing and management: report of the Task Force on Data Policies. Tech. rep. National Science Board
NSF (2010) National science foundation data management plan. http://www.nsf.gov/bfa/dias/policy/dmp.jsp. Accessed 15 Sept 2014
NSF (2014) National Science Foundation Directorate for Geosciences: Earth cube. http://www.nsf.gov/geo/earthcube/. Accessed 22 Apr 2014
OGC (2014) Open Geospatial Consortium. http://www.opengeospatial.org. Accessed 15 Sept 2014
OneGeology (2014) OneGeology. http://www.onegeology.org. Accessed 24 Apr 2014
Onoda M (2012) GEOSS Data sharing principles and action plan. Workshop on GMES Data and Information Policy, Brussels. http://ec.europa.eu/enterprise/newsroom/cf/_getdocument.cfm?doc_id=7140. Accessed 15 Sept 2014
OOI (2014) Ocean observatories initiative. http://oceanobservatories.org/. Accessed 18 Sept 2014
Open Knowledge Foundation (2012) The open data handbook. http://opendatahandbook.org/en/what-is-open-data. Accessed 15 Sept 2014
Palfrey J, Gasser U (2012) Interop: the promise and perils of highly interconnected systems. Basic Books
Pangaea (2014) Pangaea: Data publisher for the earth & environmental sciences. http://www.pangaea.de. Accessed 15 Sept 2014
Parsons MA, Fox P (2013) Is data publication the right metaphor? Data Sci J 12:WDS32–WDS46
Parsons MA, Duerr R, Minster J-B (2010) Data citation and peer review. Eos: Trans Am Geophys Union 91(34):297–299
Pearlman J, Shibasaki R (2008) Guest editorial: global earth observation system of systems. IEEE Syst J 2(3):302–303. doi:10.1109/JSYST.2008.928859
Pearlman J, Williams A, Simpson P (eds) (2013) Report of the research coordination network: RCN OceanObs Network: facilitating open exchange of data and information. NSF/Ocean Research Coordination Network Tech Rep. 46 pp
Penev L, Erwin T, Miller J, Chavan V, Moritz T, Griswold C (2009) Publication and dissemination of datasets in taxonomy: ZooKeys working example. ZooKeys 11:1–8
PILA, Inc (2013) CrossRef. http://www.crossref.org. Accessed 15 Sept 2014
Piwowar H (2011) Who shares? Who doesn’t? Factors associated with openly archiving raw research data. PLoS One 6(7):e18657
Piwowar HA, Vision TJ (2013) Data reuse and the open data citation advantage. PeerJ 1:e175
Piwowar HA, Day RS, Fridsma DB (2007) Sharing detailed research data is associated with increased citation rate. PLoS One 2(3):e308. doi:10.1371/journal.pone.0000308
Planet OS (2014) Planet OS: Big data platform for multi-sensor and machine data. https://planetos.com/. Accessed 21 Sept 2014
President Barack Obama (2013) Memorandum on open data policy–managing information as an asset (May 9, 2013). http://www.whitehouse.gov/sites/default/files/omb/memoranda/2013/m-13-13.pdf. Accessed 17 Sept 2014
Reichman O, Jones M, Schildhauer M (2011) Challenges and opportunities of open data in ecology. Science 331(6018):703–705. doi:10.1126/science.1197962
Research Information (2014) Taylor & Francis partners with figshare for supplementary data. http://www.researchinformation.info/news/news_story.php?news_id=1485. Accessed 15 Sept 2014
Research Councils UK (2014) http://www.rcuk.ac.uk/research/datapolicy. Accessed 15 Sept 2014
Research Data Alliance (2014) Research data sharing without barriers. https://www.rd-alliance.org. Accessed 15 Sept 2014
Research Information Network (2008) To share or not to share: publication and quality assurance of research data outputs. A report commissioned by the Research Information Network. http://www.rin.ac.uk/data-publication. Accessed 15 Sept 2014
Reuters T (2013) Science citation index. http://science.thomsonreuters.com/cgi-bin/jrnlst/jloptions.cgi?PC=K. Accessed 1 Mar 2014
Reuters T (2014) The data citation index. http://wokinfo.com/products_tools/multidisciplinary/dci. Accessed 11 Mar 2014
Sayogo DS, Pardo T (2012) Exploring the motive for data publication in open data initiative: Linking intention to action. In: Proceedings of the 45th Hawaii International Conference on System Sciences, IEEE Computer Society
ScienceDirect (2014) http://www.sciencedirect.com. Accessed 15 Sept 2014
SCOR/MBLWHOI/IODE (2014) Data publication/data citation project. http://www.iode.org/index.php?option=com_content&view=article&id=110&Itemid=12. Accessed 15 Sept 2014
Sears J (2011) Data sharing effect on article citation rate in paleoceanography. Eos, Trans. AGU, 92, Fall Meet. Suppl., Abstract IN53B-1628
Silva L (2014) PLoS new data policy: public access to data. http://blogs.plos.org/everyone/2014/02/24/plos-new-data-policy-public-access-data-2. Accessed 15 Sept 2014
Smit E (2010) Preservation, access and re-use of research data. Presented at DataCite Summer Meeting 2010. https://www.datacite.org/datacite_summer_meeting_2010. Accessed 15 Sept 2014
SURF (2013) Enhanced publications. Collaborative organisation for ICT in Dutch higher education and research. http://www.surf.nl/en/themes/research/research-data-management/enhanced-publications. Accessed 15 Sept 2014
Tenopir C, Allard S, Douglass K, Aydinoglu AU, Wu L, Read E, Manoff M, Frame M (2011) Data sharing by scientists: practices and perceptions. PLoS One 6(6):e21101
Thessen A, Patterson D (2011) Data issues in the life sciences. ZooKeys 150:15–51. doi:10.3897/zookeys.150.1766
Turnitsa C (2005) Extending the levels of conceptual interoperability model. In: Proceedings IEEE summer computer simulation conference, IEEE CS Press
UKDS (2014) UK Data Service, Citing Data and Re-Share. http://ukdataservice.ac.uk/media/440282/publishingcitigdata.pdf. Accessed 10 Mar 2014
US Congress (1980) Bayh-Dole act. Public Law 96–517, also known as the Patent and Trademark Law Amendments Act; enacted by the United States Congress.
Vision T (2010) Open data and the social contract of scientific publishing. Bioscience 60(5):330–331. doi:10.1525/bio.2010.60.5.2
W3C (2001) URIs, URLs, and URNs: Clarifications and recommendations 1.0. http://www.w3.org/TR/uri-clarification. Accessed 10 Mar 2014
WDS (2014) Data Publication Working Group. http://icsu-wds.org/community/working-groups/data-publication. Accessed 10 Mar 2014
Whiteside A, Evans JD (2006) Web coverage service implementation specification #06-083r8, version 1.1.0. https://portal.opengeospatial.org/files/?artifact_id=18153. Accessed 19 May 2014
Whitfield P (2012) Why the provenance of data matters: assessing “fitness for purpose” for environmental data. Can Water Resour J 37(1):23–36. doi:10.4296/cwrj3701866
Whitlock M (2011) Data archiving in ecology and evolution: best practices. Trends Ecol Evol 26(2):61–65. doi:10.1016/j.tree.2010.11.006
WHOAS (2014) Woods Hole Open Access Server. https://darchive.mblwhoilibrary.org. Accessed 16 Sept 2014
Wieczorek J, Bloom D, Guralnick R, Blum S, Döring M et al (2012) Darwin core: an evolving community-developed biodiversity data standard. PLoS One 7(1):e29715. doi:10.1371/journal.pone.0029715
Wiley (2014) Geoscience data journal. http://onlinelibrary.wiley.com/journal/10.1002/%28ISSN%292049-6060. Accessed 11 Mar 2014
World Meteorological Organization (2014) Information management. http://www.wmo.int/pages/themes/wis/index_en.html. Accessed 11 Mar 2014
Acknowledgments
The authors would like to thank other members of the NSF Research Coordination Network “OceanObsNetwork”: Milton Kampel; Takeshi Kawano; Fred Maltz; Michael McCann; Benoit Pirenne; Peter Pissierssens; Iain Shepherd; Christoph Waldmann; and Albert Williams III, who contributed to the report that this paper summarizes and enhances (Pearlman et al. 2013). The authors acknowledge the support of the National Science Foundation through Grant Award No. OCE-1143683.
Additional information
Communicated by: H. A. Babaie
Cite this article
Gallagher, J., Orcutt, J., Simpson, P. et al. Facilitating open exchange of data and information. Earth Sci Inform 8, 721–739 (2015). https://doi.org/10.1007/s12145-014-0202-2
DOI: https://doi.org/10.1007/s12145-014-0202-2