Background

Marine microbes inhabit all marine habitats, are the engines of the ocean’s major biogeochemical cycles, and form the basis of the marine food web [1]. Over the past decades, scientists have aimed to understand marine microorganisms, but technical and computational limitations have restricted studies to a local scale. With technological advances and decreasing sequencing costs, genomic studies have now become feasible on a global scale. The first landmark marine metagenome studies were published by the J. Craig Venter Institute, beginning with a pilot sampling project in the Sargasso Sea followed by the Global Ocean Sampling (GOS) expedition [2]. The Tara Oceans project expanded this further by integrating marine genetic, morphological, and functional biodiversity in its environmental context at global ocean scale and at multiple depths [3]. The Micro B3 (Marine Microbial Biodiversity, Bioinformatics, Biotechnology) project now aims to investigate global marine microbial biodiversity and has pioneered the idea of doing this on a single orchestrated Ocean Sampling Day (OSD).

Main text

Ocean Sampling Day

OSD is a simultaneous, collaborative, global mega-sequencing campaign to analyze marine microbial community composition and functional traits on a single day. On June 21st, 2014 – the world’s first major OSD event – scientists around the world collected 155 16S/18S rRNA amplicon data sets, 150 metagenomes, and a rich set of environmental metadata. Standardized procedures, including a centralized hub for laboratory work and data processing via the Micro B3 Information System (Micro B3-IS), ensured a high level of consistency and data interoperability [4]. Application of the Marine Microbial Biodiversity, Bioinformatics and Biotechnology (M2B3) standards ensures sustainable data storage and retrieval in the respective domain-specific data archives [4]. OSD generated the largest standardized data set on marine microbes taken on a single day, which we consider complementary to other large-scale sequencing projects.

The solstice was chosen to test the hypothesis that diversity negatively correlates with day length [5]. Data analysis will target three main areas: biodiversity, gene functions, and ecological models. OSD sampling sites are typically located in coastal regions within exclusive economic zones (EEZ). The OSD data set therefore provides a unique opportunity to test anthropogenic influences on microbial population ecology. We will perform a multi-level assessment of the human impact on microbially mediated biogeochemical cycles. The questions we would like to answer are: (i) what are the important factors (physicochemical and biological) in structuring biodiversity patterns and range margins, and (ii) are functions associated with heavy metals, antibiotics or fecal indicators correlated with OSD sites exposed to higher human impact? We are confident that the simultaneous collection of samples will result in the discovery of new ecological patterns, providing key information towards understanding environmental vulnerability and resilience.
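As an illustration of how the day-length hypothesis could be examined, the following minimal sketch derives an approximate day length from site latitude, computes a Shannon diversity index from taxon counts, and takes the Spearman rank correlation between the two. It is not the OSD analysis pipeline. The function names, the choice of the Shannon index and Spearman correlation, the approximate solstice declination of 23.44 degrees, and the toy counts are all assumptions made for this example. The toy data are invented and deliberately constructed so that diversity falls as day length rises; they are not OSD results.

```python
import math

# Approximate solar declination at the June solstice (assumption for this sketch).
SOLSTICE_DECLINATION_DEG = 23.44


def day_length_hours(latitude_deg, declination_deg=SOLSTICE_DECLINATION_DEG):
    """Approximate day length from the sunrise equation (no refraction correction)."""
    phi = math.radians(latitude_deg)
    delta = math.radians(declination_deg)
    x = -math.tan(phi) * math.tan(delta)
    x = max(-1.0, min(1.0, x))  # clamp: polar day (24 h) or polar night (0 h)
    return 24.0 * math.acos(x) / math.pi


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Invented toy data: (latitude in degrees, taxon counts). NOT OSD results.
toy_sites = [
    (-30.0, [25, 25, 25, 25]),
    (0.0, [40, 30, 20, 10]),
    (40.0, [70, 15, 10, 5]),
    (60.0, [97, 1, 1, 1]),
]
day_lengths = [day_length_hours(lat) for lat, _ in toy_sites]
diversities = [shannon_index(counts) for _, counts in toy_sites]
rho = spearman_rho(day_lengths, diversities)
print(f"Spearman rho (toy data): {rho:.2f}")
```

A rank-based correlation is used here because it makes no assumption that the relationship is linear. A real analysis would also need to account for confounders such as temperature, sequencing depth and the uneven latitudinal distribution of sites.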

Open access strategy and sharing of data

All OSD data are archived and immediately made openly accessible without an embargo period, following the Fort Lauderdale rules for sharing data [6]. Sequence and contextual data are publicly available via the International Nucleotide Sequence Database Collaboration (INSDC) umbrella study PRJEB5129 and at PANGAEA. A model agreement and the OSD Data Policy [4] were developed in compliance with the Convention on Biological Diversity and the Nagoya Protocol on Access and Benefit Sharing (ABS) for the utilization of genetic resources in a fair and equitable way. An ABS Helpdesk exists to support OSD participants with legal questions. Furthermore, the Mediterranean Science Commission (CIESM) developed the CIESM Charter on ABS, which has been endorsed by 391 scientists from 49 countries (as of April 2015).

The OSD Consortium

At the 16th Genomic Standards Consortium (GSC) meeting in 2014, the OSD community agreed to form the OSD Consortium. Led by the five OSD Coordinators and comprising up to 130 OSD Site Coordinators and their teams, the OSD Consortium installed the infrastructure and expertise that allow coordinated OSD events to take place. Furthermore, the OSD Consortium aims to foster collaborations and share expertise among and beyond the OSD network, and to connect scientists in a worldwide environmental movement.

Membership and governance

OSD membership is open to anyone and is earned by participation. Registered participants are provided with privileged access to the OSD network of sites, as well as training activities. OSD samples are prioritized for all types of data generation (as funds and resources allow). In return, participants agree to provide samples according to OSD’s standardized procedures and to work under the umbrella of the OSD Data Policy, which requires open sharing of data and respect for the national legal sampling framework.

The OSD network of sites

Participants from 191 sampling sites signed up for the main OSD event; these sites range from tropical waters to polar environments (Fig. 1). All major oceanic divisions (Pacific, Atlantic, Indian, Antarctic and Arctic Oceans) and continents are covered, with 81 and 37 sites in Europe and North America, respectively. The majority of sites are located in the Northern Hemisphere (172), including 36 sites in the Mediterranean and three sites in the Black Sea.

Fig. 1

Map of registered sites for OSD 2014

OSD partnerships

Endorsement by the community and fruitful partnerships are essential. Supported by the Argonne National Laboratory, the generous cooperation with the Earth Microbiome Project (EMP) [7] enabled us to perform amplicon sequencing for OSD pilot events; these were conducted on each of the solstices in 2012 and 2013. In return, OSD data are EMP-compliant and contribute towards construction of a global catalog of microbial diversity [7]. Cooperation with the LifeWatch project secured additional 18S rRNA gene sequencing, while Pacific Biosciences contributed sequencing of near-full-length 16S rRNA gene amplicons and metagenomes from selected OSD sampling sites. Moreover, the partnership with the Smithsonian Institution’s Global Genome Initiative for long-term bioarchiving of all OSD samples enables the community to re-analyze the samples in the future.

OSD beyond 2014

The OSD Consortium aims to expand in terms of sites and methods, as well as towards multicellular organisms. Future key tasks are to align closely with the Genomic Observatories (GOs) Network [8] towards biocoding the ocean, as well as to secure long-term resources and commitments to create an OSD time-series. The mid-term vision of the OSD Consortium is to generate microbial Essential Biodiversity Variables (EBV) data [9]. The envisioned regular OSD events would qualify for the candidate EBVs “Species populations” and “Community composition” to indicate, for example, vulnerability of ecosystems and climatic impacts on community composition. In the long term, such indicators may be incorporated into the Ocean Health Index (OHI) [10], which currently excludes microorganisms from biodiversity assessment due to the lack of reliable data. OSD has the potential to close that gap and to extend the EBVs and the OHI by expanding oceanic monitoring towards microbes. This could lead to a global system of harmonized observations to inform scientists and policy-makers.

Conclusions

This commentary outlines the process for creating, managing and formalizing the OSD Consortium and describes its vision for a sustainable study of marine microbes. As we move forward, we will continue to explore and expand the scope of OSD beyond 2014. The idea of an OSD time-series is still in its early days, but incorporating the OSD data set into the EBVs and the OHI is a strong source of motivation, since this could pave the way to prioritizing scientific research and raising public awareness of the unseen majority of the world’s oceans.