Microbial metagenomics in the Baltic Sea: Recent advancements and prospects for environmental monitoring

Metagenomics refers to the analysis of DNA from a whole community. Metagenomic sequencing of environmental DNA has greatly improved our knowledge of the identity and function of microorganisms in aquatic, terrestrial, and human biomes. Although open oceans have been the primary focus of studies on aquatic microbes, coastal and brackish ecosystems are now being surveyed. Here, we review the studies published so far that use high-throughput sequencing of environmental DNA and RNA to investigate microbes in the Baltic Sea, one of the world’s largest brackish water bodies. Collectively, the data illustrate that Baltic Sea microbes are unique, highly diverse, and well adapted to this brackish-water ecosystem; these findings represent novel baseline knowledge necessary for monitoring purposes and sustainable management. More specifically, the data relate to environmental drivers for microbial community composition and function, assessments of the microbial biodiversity, adaptations and role of microbes in the nitrogen cycle, and microbial genome assembly from metagenomic sequences. With these discoveries as background, prospects of using metagenomics for Baltic Sea environmental monitoring are discussed.


INTRODUCTION
Abundant and ubiquitous environmental microorganisms are important drivers of global biogeochemical cycles, and understanding factors controlling their abundance, activities, and diversity is therefore a major area of research. Studying microbes in situ, however, is challenging.
Microbial communities are composed of mixed and diverse assemblages and often reside in hard-to-sample habitats, and the great majority of their members cannot be cultivated in the laboratory. The development of high-throughput sequencing (HTS) technologies during the last decade has therefore in many ways revolutionized the study of natural communities of microbes, now characterized as the "unseen majority." This technology allows, in a cost-effective way, the analysis of diversity, metabolic functions, and biological interactions in complex, uncultured microbial communities. Extraction of DNA from mixed communities of microbes, followed by HTS, has greatly increased our understanding of the fundamental roles played by terrestrial, aquatic, and human-associated microbiota (Fierer et al. 2007; Zinger et al. 2011; Huttenhower et al. 2012). Pioneering work analyzing sequenced environmental DNA (eDNA) from marine bacterial communities led to the discovery that much of the microbial diversity in the world's oceans, from surface to deep waters, had been severely underestimated (Venter et al. 2004; DeLong et al. 2006). Important findings from HTS of eDNA or RNA include the discovery of novel genes, proteins, and microbial species (Yooseph et al. 2007; Gilbert et al. 2008), and findings related to the role of microbes in global biogeochemical cycles of carbon and nitrogen (Frias-Lopez et al. 2008). More recent analyses have also shown that genomic plasticity and metabolic versatility of microorganisms are the basis for bacterial adaptation in marine ecosystems (Konstantinidis et al. 2009; Lauro et al. 2009; Yooseph et al. 2010). Hence, the development and introduction of these potent technologies have enabled a much more detailed view of microbial communities and their functions in natural settings and have profoundly changed our perception of microbial life, genome evolution, and minimal requirements for life (Karl 2007; Giovannoni et al. 2014).
HTS technology-based analyses of natural microbial communities can be divided into two sub-classes. The more commonly used "HTS-signature-gene" approach surveys eDNA for single marker gene distributions and abundances and often uses the conserved 16S rRNA gene, which encodes the RNA component of the small subunit of the prokaryotic ribosome. In contrast, the metagenomic shotgun approach potentially exposes the majority of genes/genomes present in eDNA extracted from natural microbial populations. Due to the non-targeted nature of the metagenomic approach, substantial amounts of diverse genetic data and information on the functional potential of entire microbial communities may be obtained. Metagenomics, however, has its own biases. As a result of differences in genome size and sequence composition (e.g., high or low GC content), DNA from different genes and organisms is not uniformly covered during sequencing, nor can all sequences be correctly annotated (identified), owing to the lack of experimental evidence for protein-coding sequences and the still limited number of sequenced microbial genomes available in databases. Metagenomics is increasingly often combined with metatranscriptomics, which targets all RNA molecules in a natural sample and thereby expands the scope of metagenomics by also providing information on gene expression.
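The GC-content bias mentioned above is commonly inspected by computing the GC fraction of each read or contig. The following is a minimal Python sketch under that assumption; the function name `gc_content` and the example reads are hypothetical and are not taken from any of the reviewed studies.

```python
def gc_content(seq):
    """Return the fraction of G and C among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored. An empty or fully ambiguous
    sequence yields 0.0 rather than raising an error.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


# Hypothetical reads, used for illustration only.
reads = ["ATGCGC", "AATTAT", "GGCCNN"]
gc_values = [gc_content(read) for read in reads]
```

Plotting such values against per-read or per-contig coverage is one simple way to check whether high- or low-GC sequences are under-represented in a dataset.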
Most early metagenomic studies focused on oceans, while smaller seas, freshwater systems, and, in particular, brackish-water transitions have until recently remained poorly investigated using HTS. The Baltic Sea is the world's second largest body of brackish water and represents one of the most intensely researched and monitored aquatic environments, with a time series of hydrographic data measured routinely for over 100 years (Fonselius and Valderrama 2003). Since the turn of the twentieth century, the Baltic Sea has been negatively affected by anthropogenic disturbances, specifically eutrophication derived from urban and agricultural sources, which fuels phytoplankton blooms and causes increased anoxia and hypoxia in deep waters (Savage et al. 2010; Carstensen et al. 2014). In addition, the Baltic Sea offers steep gradients in salinity and key-nutrient concentrations. These gradients are semi-constant over time but change dramatically over a short geographical distance, giving rise to a challenging environment for many marine and freshwater organisms. Paired with geographic isolation, this has resulted in low species and genetic diversity among metazoans (e.g., fish, seals) and macrophytes (e.g., macro-algae) (Johannesson and Andre 2006). The diversity and biogeography of microbes in the Baltic Sea and associated waters have, however, attracted considerably less research.
Here, we provide an updated account on how the introduction of HTS-based analyses into Baltic Sea microbial research has contributed to the understanding of Baltic Sea microbes. Recent data collectively illustrate that Baltic Sea microbes are both unique and highly diverse, and well adapted to this brackish water ecosystem. To widen the scope, the Baltic Sea findings are placed in the context of metagenomic microbial findings in oceans in general. Finally, the ecological significance of microbes in any environment suggests a need for implementation of HTS data into future environmental monitoring programs, prospects of which are discussed here in a Baltic Sea perspective.

ENVIRONMENTAL SEQUENCING OF BALTIC SEA MICROBES
Earlier reports on Baltic Sea microorganisms focused largely on quantifying abundance and activity in relation to physicochemical parameters (see e.g., Hagström and Larsson 1984; Gast and Gocke 1988; Rheinheimer et al. 1989). The introduction of molecular-based approaches expanded our knowledge of microbial community composition, seasonal succession, and phylogenetic diversity, revealing for instance temporal patterns for specific bacterial phylotypes (Pinhassi et al. 1997; Pinhassi and Hagström 2000) and a strong influence of freshwater phyla on brackish water communities (Riemann et al. 2008). A pre-HTS-era metagenomic analysis (using a cloning approach) of Baltic Sea sediment microbial communities was published in 2007 (Hårdeman and Sjöling 2007), while the first HTS-signature-gene-based (pyrosequencing) study appeared in 2010 (Andersson et al. 2010). This study targeted bacterioplankton at the Landsort Deep, the deepest location (459 m) and a long-term monitoring site in the Baltic Sea (Baltic proper). Subsequently, a number of HTS-based studies of Baltic Sea microbial life have followed. As seen in Table 1, six studies are based on the HTS-signature-gene approach and four on random metagenomic sequencing of all genes. The work of Feike et al. (2012) stands out as the only purely metatranscriptomic analysis, and yet another recent study describes genome assembly based on metagenomic sequences (Herlemann et al. 2013).

BACTERIAL BIOGEOGRAPHY AND DIVERSITY
In addition to steep horizontal and vertical concentration gradients related to salinity and key nutrients, such as nitrogen and phosphorus, there is also a pronounced seasonal variation in both nutrient concentrations and temperature in the Baltic Sea. The effects of these variables were investigated by Andersson et al. (2010) using a 16S rRNA HTS approach to study bacterioplankton communities at Landsort Deep. A pronounced influence of both phosphorus and temperature on the microbial community, notably composed of characteristic freshwater bacteria such as actinobacteria, betaproteobacteria, and verrucomicrobia, was found. While some early studies suggested that salinity may influence Baltic Sea bacterial growth and biogeography (Väätänen 1980; Heinanen 1991), the extent of this effect was not realized until the first large-scale 16S rRNA gene inventory along the entire Baltic Sea salinity gradient was performed (Herlemann et al. 2011). A subsequent comprehensive metagenomic survey substantiated the strong structuring of the bacterial community composition along the Baltic salinity gradient and also included the freshwater Lake Torne Träsk and the marine waters off the Swedish west coast as additional reference points. Among eubacteria, a clear dominance of actinobacteria was apparent in the low-salinity Bothnian Bay in the northern Baltic Sea, while a shift toward dominance of proteobacteria (mainly alpha and gamma) was apparent at higher salinities (including at the Swedish west coast). As in most global oceans, the dominant bacteria in the Baltic Sea were alphaproteobacteria of the SAR11 clade (Morris et al. 2002; Dupont et al. 2014). Overall, the community compositions at the phylum level in the two studies were largely in agreement, and both reported a unique autochthonous brackish bacterial population present at intermediate salinity stations, including strains of SAR11 and picocyanobacteria (Herlemann et al. 2011; Dupont et al. 2014).
The variation in bacterial community composition seen in the Baltic Sea along the salinity transect is considerably more dramatic than in most oceanic habitats. A comparative network-analysis of metagenomes (both with respect to taxonomy and functional potential) collected at various depths (from surface to anoxic sediments) in the Baltic Sea and 27 metagenomes from 11 sites worldwide showed that the Baltic Sea bacterial communities clustered primarily with metagenomes from the western English Channel (Thureborn et al. 2013). Unique to the Baltic Sea, however, was the community derived from the metagenome collected at the oxic-anoxic interface, being an outlier with few taxonomic similarities to any other community (Thureborn et al. 2013). It should, however, be emphasized that the reference metagenomes used in these analyses were all from marine environments (except two that were from terrestrial), and that only one geographic site in the Baltic Sea was included. The increased availability of metagenomes from the whole Baltic Sea salinity transect now warrants expanded comparative studies.
Other findings from comparing HTS analyses from the Baltic Sea to other marine environments relate to the diversity of the Baltic Sea microbial community. In one of the first HTS/16S rRNA based studies, a lower bacterial diversity was observed in the central eastern Baltic Sea (northern Baltic proper/Gulf of Finland) compared to that of some investigated fully marine oceanic habitats (Koskinen et al. 2011). However, later HTS-based analyses (16S rRNA and metagenomic) of the genetic diversity of bacteria along the whole longitudinal extent of the Baltic Sea did not show any such reduced diversity at intermediate salinities (Herlemann et al. 2011; Dupont et al. 2014). In contrast, these studies rather demonstrate a surprisingly high microbial diversity throughout the Baltic Sea. This suggests that Baltic Sea microorganisms are less impaired by genetic isolation than are macro-organisms (Johannesson and Andre 2006), presumably underpinned by a combination of short microbial doubling times (hours/days) and comparatively small and flexible genomes. A characteristic of microbes is also frequent horizontal gene transfer events between sympatric microbes, a phenomenon that was recently investigated in Baltic Sea picocyanobacteria (Larsson et al. 2011).
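Comparisons of diversity such as those above rest on a numerical diversity measure computed from taxon counts. The reviewed studies are not quoted here as using any particular index; as an illustration only, the sketch below assumes the widely used Shannon index, H = -sum(p_i ln p_i), where p_i is the relative abundance of taxon i. The function name and the example counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with nonzero counts.

    `counts` is an iterable of non-negative read counts per taxon.
    Returns 0.0 for an empty sample or for a sample with a single taxon.
    """
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        p = c / total
        h -= p * math.log(p)
    return h


# Hypothetical taxon counts for two samples (illustration only).
even_sample = [25, 25, 25, 25]   # four equally abundant taxa
skewed_sample = [97, 1, 1, 1]    # one dominant taxon
h_even = shannon_index(even_sample)
h_skewed = shannon_index(skewed_sample)
```

With the same number of taxa, the evenly distributed sample gives the higher value, which is why such indices are usually compared only between samples sequenced and processed in the same way.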

MICROBIAL COMMUNITY FUNCTION: FROM GENE FREQUENCIES TO GENOME ASSEMBLY
Metagenomic data may be used to predict metabolic potentials of microbes as they expose existing gene repertoires and related metabolic processes. Although other factors such as regulation of gene transcription and enzyme activities, and the availability of substrates, are of critical importance, the relative frequencies of specific genes may point to their functional importance in an environment. While the effect of salinity in shaping microbial communities is well known (see e.g., Lozupone and Knight 2007; Campbell and Kirchman 2013), the mechanism that makes salinity such a strong barrier for bacteria to cross is not. However, a recent metagenomic analysis, encompassing the Baltic Sea salinity gradient, revealed that salinity influences not only the distribution of traits such as ion transporters (e.g., Na, K) and biosynthesis and transport of compatible solutes, but also bacterial metabolic core functions such as respiration, glycolysis, and cofactor biosynthesis. Notably, analogous metabolic pathways, with approximately the same outcome but via different intermediate metabolites and genes, were found to have opposite abundance patterns along the salinity gradient. For instance, the glycolytic Entner-Doudoroff (ED) pathway dominated at high salinity while the Embden-Meyerhof (EM) pathway dominated at low salinity. Low-salinity adapted bacteria overall use pathways with a higher ATP yield than bacteria in marine environments. These metabolic differences may explain the distinct divide known to exist between fresh and marine microbial communities. It further suggests that adaptation to a lower salinity may be based on a core gene set with higher energy yield. This discovery now calls for further exploration. Other important findings from metagenomic functional analyses of the Baltic Sea microbes relate to the impact of eutrophication and pollution.
Nutrients (nitrogen and phosphorus) are the second most important factor (besides salinity) in shaping the distribution of microbial taxa and their functional potential in Baltic Sea surface water communities. More specific information on the influence of eutrophication and pollution was obtained from a metagenomic study of the Landsort Deep microbial community (Thureborn et al. 2013). The functional gene repertoire showed a comparatively high abundance of microbial genes involved in attachment to and degradation of organic carbon and in heavy metal resistance (e.g., against cobalt, cadmium, and zinc). These findings are likely related to organic matter deposition and the high concentrations of metals in sediments at this site (Thureborn et al. 2013). Overall, the resulting gene diversity at this specific site appears to be shaped by anthropogenic pollution and eutrophication. However, it should be noted that the deeper waters at this site offer radically different conditions (hypoxia or anoxia) compared to the rest of the Baltic Sea, affecting the overall bacterial biodiversity dramatically. The effect of dissolved organic carbon (DOC) on the Baltic Sea bacterial community was also recently investigated by combining mesocosms and metagenomic analyses (Dinasquet et al. 2013; Herlemann et al. 2014). While a weak effect of the DOC on structuring the community was the norm, some taxa were clearly influenced. This approach may be an efficient tool to evaluate details in successional changes as a response to nutrient regimes.
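The relative gene frequencies discussed in this section are typically obtained by normalizing annotated read counts within each sample, so that samples of different sequencing depth can be compared. A minimal Python sketch of that normalization follows. The sample names, category labels, and counts are hypothetical placeholders and do not reproduce data from any of the cited studies.

```python
def relative_frequencies(counts):
    """Convert raw per-category read counts into fractions of the sample total.

    `counts` maps a gene or pathway label to its read count in one sample.
    A sample with no reads yields 0.0 for every category.
    """
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in counts}
    return {label: n / total for label, n in counts.items()}


# Hypothetical annotated read counts for two samples (illustration only).
samples = {
    "low_salinity": {"EM_pathway": 30, "ED_pathway": 10, "other": 60},
    "high_salinity": {"EM_pathway": 20, "ED_pathway": 60, "other": 120},
}
profiles = {name: relative_frequencies(c) for name, c in samples.items()}

# Ratio of the two glycolytic pathways within each sample.
em_to_ed = {
    name: p["EM_pathway"] / p["ED_pathway"] for name, p in profiles.items()
}
```

Working with within-sample fractions or ratios, rather than raw counts, keeps the comparison independent of how many reads each sample happened to receive.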
To further explore putative functions, (meta-)genomic assembly data in the form of contigs, i.e., longer genomic sequences obtained by aligning partly overlapping sequence reads, may be analyzed. For example, a recent analysis of light-harvesting genes in the Baltic Sea identified a novel gene cluster in the picocyanobacterial population. These cyanobacteria, with a cell size <2 µm, are major primary producers in oceans (Scanlan et al. 2009). This is also the case in the Baltic Sea where picocyanobacteria may constitute up to 80 % of the cyanobacterial population (Stal et al. 2003; Hajdu et al. 2007). Metagenomic analyses show that the Baltic Sea picocyanobacteria are dominated by strains belonging to the genera Synechococcus and Cyanobium (unpublished results) and that members of the dominant Synechococcus clade harbor a novel gene cluster encoding proteins for a unique set of light-harvesting antennae, i.e., pigment-associated phycobilisomes, not previously found in cyanobacteria. The organization of the gene cluster suggests the involvement of multiple horizontal gene transfer events. The Baltic Sea picocyanobacteria may have evolved a set of phycobiliproteins with a potentially unique absorption spectrum, to specifically match light conditions offered by the Baltic Sea. These findings exemplify how "meta-omic" datasets can provide novel insights and generate hypothesis-driven research, particularly targeting processes in the large segment of still non-cultivable aquatic microbes, including those in the Baltic Sea.
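The idea of building contigs from partly overlapping reads can be illustrated with a toy greedy procedure that repeatedly merges the pair of sequences sharing the longest exact suffix-prefix overlap. This is a teaching sketch only: the function names and reads are hypothetical, and production assemblers generally use graph-based methods and tolerate sequencing errors, which this example does not.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 if no exact overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Greedily merge reads into contigs by longest exact overlap."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # no remaining pair overlaps; stop
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [
            c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)
        ] + [merged]


# Three hypothetical reads that tile a 12-base sequence.
reads = ["ATGCGT", "CGTTAC", "TACGGA"]
contigs = greedy_assemble(reads)
```

Reads that share no sufficiently long overlap are returned unmerged, which mirrors how low-coverage regions of a metagenome remain fragmented into short contigs.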
With recent developments in bioinformatic analysis of metagenomic data, it is possible to assemble not only contigs of limited length but also near-complete genomes of abundant organisms (Iverson et al. 2012). For example, metagenomic time-series samples from the Baltic Sea were used to assemble a genome from an aquatic phylotype of the verrucomicrobia Spartobacteria (Herlemann et al. 2013), which is one of the dominant organisms in the Baltic Sea bacterial community during the summer (Herlemann et al. 2011). Analysis of the assembled Spartobacteria genome gave important information about the metabolic capacity of the bacterium, including the presence of 23 glycoside hydrolases, giving the bacterium the capacity to metabolize a number of different carbohydrates and suggesting a potentially important role in carbon cycling. Based on patterns of co-occurring abundances, it was further suggested that the carbon was mainly derived from cyanobacterial blooms. It is clear that this approach can take the metagenomic scope even further and provides an important step in increasing the number of microbial genomes available, one of the prerequisites for correct annotation of metagenomic sequences.

THE NITROGEN CYCLE: FROM ANOXIC ZONES TO SURFACE WATERS
Today, the Baltic Sea suffers from large and persistent anoxic bottom zones. This is partly a natural phenomenon caused by strong stratification, which prevents vertical mixing. Eutrophication of the Baltic Sea has increased the area of these zones (Carstensen et al. 2014), and they today constitute the largest anthropogenically induced hypoxic area in the world.
The nitrogen cycle consists of microbially mediated transformations of nitrogen, some of which are dependent on reducing conditions (e.g., denitrification and anammox). Research concerning hypoxic environments in the Baltic Sea has therefore often targeted these processes, more recently using HTS technologies. Microbial metagenomes from the Landsort Deep water column illustrate a stratification of the microbial functional capacities along the depth and oxygen profile (Thureborn et al. 2013; Dupont et al. 2014). While genes for the anammox reaction were absent, a high frequency of genes involved in denitrification prevailed at the deepest anoxic sites (Thureborn et al. 2013). It was furthermore suggested that the denitrification at this depth was primarily carried out by chemolithotrophic (sulfur-oxidizing) denitrifying epsilonproteobacteria. Later, Dupont et al. (2014) also observed a high prevalence of epsilonproteobacteria in these specific waters. Together the findings suggest an important role of these organisms in denitrification at the Landsort Deep.
The substrate for epsilonproteobacterial denitrification (nitrate) was suggested by Thureborn et al. (2013) to originate from aerobic ammonia-oxidizing thaumarchaeota. A high abundance of ammonia-oxidizing thaumarchaeota was indeed previously observed in the suboxic zones (70-120 m depth) of the central Baltic Sea (Labrenz et al. 2010). More recently, metatranscriptome analyses substantiated these findings by showing that the transcript level of ammonia oxidation genes (amoA, amoB, and amoC) was high in the suboxic zones of both Landsort and Gotland Deep (Feike et al. 2012). The nitrifying potential of pelagic archaea was first demonstrated by an early metagenomic study (Venter et al. 2004). Since then ammonia-oxidizing archaea have been found to be widely distributed in the world's oceans and likely play a significant role in the global nitrogen cycle (Erguder et al. 2009).
Fixation of atmospheric dinitrogen is a microbial process that has received considerable research attention (Gruber 2005). This is particularly the case for the Baltic Sea, with its typical massive summer blooms of nitrogen-fixing cyanobacteria (Stal et al. 2003; Kahru and Elmgren 2014). In fact, the nitrogen fixation by the large filamentous cyanobacterial blooms represents the second largest source of "new" nitrogen (N) input into the Baltic Sea after riverine load (Larsson et al. 2001). The principal enzyme that catalyzes nitrogen fixation, nitrogenase, is encoded by highly conserved nif genes (nifKDH encoding the structural protein), and these genes are obvious targets in metagenomic and metatranscriptomic surveys. Metagenomic surveys for nif genes in the surface waters of the ocean have found surprisingly few sequence reads, despite high rates of nitrogen fixation repeatedly recorded by bloom-forming cyanobacteria (Johnston et al. 2005). The rarity of nif genes in metagenomes may be due to a comparatively restricted distribution of these genes among organisms in the massive microbial metagenomic datasets (Johnston et al. 2005). To address this difficulty, it has been suggested that to properly expose all potential nitrogen fixers, a minimum set of six nif genes should be targeted, namely nifHDK and nifENB (Dos Santos et al. 2012). Even when including these nif genes in metagenomic analysis of Baltic Sea surface waters, few nif sequences were retrieved (Thureborn et al. 2013). This was explained by pre-bloom sampling and by the use of a pre-filtration step (<3.0 µm), which may have excluded the dominant larger nitrogen-fixing filamentous cyanobacteria. A considerably higher number of nif gene sequences were later retrieved from a Baltic Sea metagenomic dataset sampled in July and including larger sized microbes (3.0-200 µm).
Despite the relatively few nif sequences retrieved from surface waters in the Baltic Sea, nif gene abundances increased with depth at Landsort Deep (Thureborn et al. 2013). This suggests the involvement of heterotrophic organisms, in this case, sulfate-reducing Deltaproteobacteria (comprising 36 % of the nif genes at this site), in Baltic Sea nitrogen fixation. Combining nifH HTS analyses, gene expression measurements and nitrogen fixation rate determinations for Baltic Sea microbes (Farnelid et al. 2013) showed that heterotrophic nitrogen fixation may account for up to 6 % of the total annual nitrogen fixation. Heterotrophic nitrogen fixation rates have recently been documented in hypoxic waters of, for example, the eastern tropical South Pacific (Fernandez et al. 2011) and the Southern Californian Bight (Hamersley et al. 2011). Hence, heterotrophic nitrogen fixation may constitute an overlooked component of the nitrogen cycle not only in the Baltic Sea but also in other oceans. Additional spatial and temporal studies are now warranted to deepen our knowledge on nitrogen fixation and, in particular, on the variety of microbial nif gene operators in Baltic Sea waters, besides the well-known photoautotrophic cyanobacteria assumed to dominate.

METAGENOMICS IN BALTIC SEA MONITORING
HTS-based methods have huge capacity to provide detailed and all-encompassing information on microbial identity and potential function. In turn, this creates great potential for sequencing-based monitoring programs. Major advantages include the improved accessibility (i) to a widened coverage of uncultured organisms, (ii) to functional genes targeting specific metabolic processes of relevance for understanding ecosystem processes, (iii) to small bacteria, eukaryotic phytoplankton, and viruses, i.e., microbes lacking distinct morphologies, and finally, (iv) to a more consistent taxonomic identification. Efficient monitoring systems are characterized by continuous sampling and rapid handling, processing, and analyses of samples. Today, available HTS techniques enable simultaneous analyses of thousands of microbial samples with sufficient sequencing depths to reliably capture taxonomic diversity (Caporaso et al. 2011).
Initiatives to introduce genomic/metagenomic analyses in global monitoring programs and observatories are well underway, although not yet a standard (Bourlat et al. 2013; Davies et al. 2014). In an overview of the use of genomic tools in monitoring programs, Bourlat et al. (2013) identified thirteen indicators for qualitative descriptors from the "Marine Strategy Framework Directive" (MSFD, 2008/56/EC), for which genomic tools can be implemented (Fig. 1). The descriptor categories include biological diversity, nonindigenous species, food webs, human-induced eutrophication, and seafloor integrity, all of which are of great relevance in assessing the environmental status of the Baltic Sea. More specific examples of potential targets in a Baltic Sea metagenomics-based monitoring program include genes and organisms of importance for eutrophication and nutrient cycles, with the processes in focus encompassing, e.g., photosynthesis, nitrification/denitrification, nitrogen fixation, and phosphate uptake/metabolism. Other sets of target organisms/genes may be related to pollution (e.g., biodegradation of organic pollutants), or to public health concerns and "early-warning systems," including toxin-producing microorganisms/toxin genes, as well as pathogens (e.g., Vibrio) and pathogenicity genes. Yet another example of microbe/gene sets are those involved in vitamin production, in view of the ongoing thiamine (B1) deficiency in higher Baltic Sea organisms (see e.g., Balk et al. 2009). It should, however, be pointed out that while attempts have been made to define sets of genes that can be used as indicators of environmental perturbations (see Yergeau et al. 2007; Bengtsson-Palme et al. 2014), the field still requires intensive research. Understanding of the relevant reporter genes is necessary for efficient use of metagenomics for monitoring purposes.
Phytoplankton has long been included in environmental monitoring programs covering the Baltic Sea water body, and methods for sampling and identification were standardized in 1991 through the establishment of the HELCOM phytoplankton expert group. This program, like most others, relies on a morphology-based identification of phytoplankton (light microscopy). However, these practices require considerable work efforts and cover a limited number of samples. Many phytoplankton also lack obvious morphological characteristics needed for identification. The advancement of HTS technologies therefore appears as a promising alternative (or complement) to morphology-based identification. A comparative analysis of a genetic (16S rRNA gene) and a morphology-based identification of phytoplankton revealed considerable discrepancies. For instance, Euglenophyta and Heterokonta were less frequently identified by the sequence-based approach while cyanobacteria were more frequently identified (Eiler et al. 2013). A similar comparison between metagenomic identifications and conventional monitoring data publicly available (SMHI) shows similar discrepancies (Fig. 2). Even though the datasets are not directly comparable (e.g., not based on the same samples), the differences in cyanobacterial, diatom, and green algae identification and abundances are worth noting. These discrepancies are likely indicative of inherent methodological issues for both methods. For the morphology-based identification, these include, for example, cell preservation and human-biased microscopic identification, while for the sequence-based method, they relate to DNA/RNA extraction, primer biases, library preparation, sequencing depth, and the assembly of short DNA sequences (Gomez-Alvarez et al. 2009; Niu et al. 2010; Schmieder and Edwards 2011). However, perhaps the largest challenge for a sequence-based monitoring program is the limited availability of sequenced reference strains.
Metagenomic sequencing projects generate vast amounts of data, but more than half of the reads may end up as "unclassified" due to lack of such reference (genome/genetic) material. One recently introduced way of resolving these issues is the above-discussed assembly or "binning" of genomes using metagenomic sequences (Herlemann et al. 2013; Alneberg et al. 2014; Nielsen et al. 2014). Of specific interest are also genome-sequencing projects focusing on maximizing phylogenetic coverage (Wu et al. 2009; Shih et al. 2013) and accessing the "rare microbial biosphere" (Dini-Andreote et al. 2012). For eukaryotes with larger genomes, DNA barcoding and metabarcoding are still likely more realistic options for monitoring purposes than both genome sequencing and metagenomics. In the barcoding approach, signature DNA sequences are collected from type-organisms that are either cultured or documented, e.g., by micrographs (Pawlowski et al. 2012). Barcoding initiatives furthermore ensure that sequence-based monitoring data can be harmonized with, and constitute a direct continuation of, long-term morphology-based data already at hand. Novel sequencing systems are currently being developed and evaluated for monitoring purposes in various marine ecosystems, and combined with innovative automated sampling devices such as real-time water monitoring buoys and environmental sample processors (Preston et al. 2009; Ottesen et al. 2011), these are promising tools for both expanding our knowledge and protecting microbial life.
Fig. 1 Metagenomics in identifying and monitoring of microbes in the Baltic Sea. A schematic flowchart of the metagenomic approach used in the MiMeBS program and its potential integration in monitoring programs for the Baltic Sea. Criteria for which genomic methods can be used to assess environmental status were derived from the "Marine Strategy Framework Directive" (Bourlat et al. 2013)
These devices will be particularly important considering ongoing global warming, likely to negatively affect services delivered by natural aquatic ecosystems (Worm et al. 2006; White et al. 2012). Climate change predictions indicate that life in the Baltic Sea will be exposed to more drastic negative effects than those expected in global oceans (Meier et al. 2012), stressing the need for improved sustainable management practices for this body of water. However, even in a global perspective, research efforts have only recently started to target consequences of such changes for microbial life and hence on the global biogeochemical cycling of nutrients. Expanded genomics-based monitoring programs are therefore as urgently needed for the Baltic Sea as for oceans and other water bodies.
Fig. 2 Comparison of phytoplankton and metazoan classifications in environmental samples via genetic (metagenomic) and microscopy-based methods. Samples starting with "GS" represent metagenomic sequencing where classifications were made using similarity searches of protein-coding genes against reference databases. The remaining samples represent microscopy-based classifications available in the SMHI database of environmental parameters (www.smhi.se). SMHI-sites closest to and within 50 km of the "GS" sampling locations were identified and are shown within the same-shaded area as their nearest metagenomic samples with distances shown in parentheses. Abbreviated SMHI-sites are as follows: N.mal.fj Nordmalingsfjärden, 1, BY31 BY31 LANDSORTSDJ, BROFJ. STRETU. BROFJORDEN/STRETUDDEN

CONCLUSION
The ability of an organism to survive through long-term and/or rapid changes in the environment is determined by its genetic repertoire and capacity to adapt physiologically. While metazoans are restricted by long generation times, genetic adaptations in bacteria can be fast and substantial. The steep physicochemical gradients and geographic isolation in the Baltic Sea pose challenging prerequisites for most organisms; however, fast growth and small microbial genomes enable drastic genetic modifications as a response to this variable environment. In addition to potentially profound effects of the microbial biodiversity on productivity and nutrient retention, the large pool of genetic diversity in Baltic Sea microbes discovered through HTS projects provides a valuable resource for resilience. In spite of the documented microbial diversity and functional potential in the Baltic Sea metagenomic datasets generated, many microbial-driven biochemical processes, ecosystem interactions and environmental adaptations remain insufficiently investigated. The studies reviewed here, based on microbial metagenomics, create a necessary base-line for monitoring the effects of a changing environment. Still, in order to develop efficient tools for monitoring programs, additional vertical, horizontal, and seasonal sampling and analyses of microbes in the Baltic Sea are required. Information obtained can then be used to create cost-effective screenings of, e.g., key-players in biogeochemical cycles, pathogenic organisms, and important components of Baltic food webs. This will in turn ensure a scientifically sound, knowledge-based management of the Baltic Sea, and its organismal resources in the future.