Habitability analyses of aquatic bacteria
Habitability is defined as the ability of an organism to inhabit different environments. The habitability of organisms, however, cannot be inferred from analyses such as whole-genome sequencing or community structure surveys. A recently developed database, the MetaMetaDB, records the kinds of environments from which a particular 16S rRNA sequence has been obtained, and thus enables us to infer the habitability of the bacterium in question. To check the applicability of this database to the study of the habitability of aquatic bacteria, we examined samples collected at two Naka River stations, one estuarine station in the Naka River Estuary, two coastal stations at Oarai in Ibaraki Prefecture, Japan, and one station in the Kuroshio Current of the western North Pacific. The phylotypes were tracked against the MetaMetaDB. As expected, the low-salinity stations were dominated by sequences with “freshwater and groundwater”, “human” and “wastewater” habitat identities, while the high-salinity stations were dominated by those with a “marine” identity. The phylotypes of low-salinity stations with a particular habitat identity were absent or rare in the high-salinity stations and vice versa. The MetaMetaDB also showed that sequences of Cyanobacteria or related phylogenetic groups may be present in the human gut, and indicated the probable distribution of the relatives (ancestors/descendants/siblings) of some bacteria. These overall findings show that the MetaMetaDB is useful as a new tool to infer microbial habitability and that it gives new information on the possible origin and ecology of microorganisms in the environment.
Keywords: 16S rRNA; 454 Pyrosequencing; MetaMetaDB; Habitability; Aquatic bacteria
To maintain a population size in nature, bacteria have to multiply or flow in at a rate that at least offsets the rate of decline due to death (as a result of predation, bacteriolysis, etc.) or outflow from the environment. The ability of bacterial cells to adapt to an environment determines this rate and also the habitability, which means the ability to inhabit different environments. Information on habitability is important for understanding how the apparent community structures in the environment are formed and how they may vary in response to the predominant conditions. However, it is difficult to infer the habitability of bacteria by widely used genetic or physiological approaches.
The development of molecular techniques has made it possible to clarify bacterial community structures without cultivation (Acinas et al. 1997; DeLong et al. 1993; Giovannoni et al. 1995; Hiorns et al. 1997). For analyses of bacterial community structure, 16S rRNA gene sequences are used for taxonomic assignment because of their adequate substitution rate to differentiate species, their suitable gene size and the accumulation of massive data sets in databases such as Greengenes (DeSantis et al. 2006), the Ribosomal Database Project (Cole et al. 2009) and SILVA (Quast et al. 2013). Most of these databases are readily accessible for comparative work. A single database, however, does not offer any data for comparison among different environments, because each database usually covers a single environment, such as terrestrial, marine or human. If those databases are combined and used collectively, it becomes possible to search for the environments from which a particular sequence has been detected and recorded. This offers information on habitability.
The recently developed MetaMetaDB (http://mmdb.aori.u-tokyo.ac.jp/; Yang and Iwasaki 2014) contains a data set of 16S rRNA sequences derived from 454 platforms in the DDBJ Sequence Read Archive (DRA). It offers environmental categories that indicate in what kind of environments each sequence was obtained and recorded. Based on this recorded information, a particular sequence can be classified into its most probable habitat, and the variety of such habitats in a particular sample or in a taxonomic group can be termed habitability. It is noteworthy that the lack of a record does not mean the absence of the corresponding sequence, and that there may be some biases in the habitable environments depending on the relative amount of deposited sequence data. Nevertheless, we believed that this can be a new tool to infer the habitability of prokaryotic organisms in natural environments. Hiraoka et al. (2016) checked the habitability of two soil prokaryotic communities with the MetaMetaDB and showed that the soil affected by the tsunami in 2011 tended to contain more sequences of marine habitat than the unaffected soil. More investigations are, however, required to confirm the reliability or usefulness of the MetaMetaDB for prokaryotic populations in nature.
Coastal areas are under the influence of terrestrial, freshwater and offshore marine environments. Bacterial community structures are formed as a result of the mixing of those different environments (Crump et al. 1999; Crump et al. 2004). We hypothesized, first, that salinity is a major factor controlling community structures; second, that the application of the MetaMetaDB may clarify, at least in part, the mixing of freshwater and seawater; and third, that the habitability of bacteria attached to particles differs from that of bacteria in a free-living state. Therefore, we collected samples from the Naka River, the Naka River Estuary, the Oarai coasts (all in Ibaraki Prefecture, Japan) and the Kuroshio Current in the western North Pacific Ocean. To separate the particle-associated (PA) and free-living (FL) fractions, water samples were filtered through two filters of different pore sizes. The purpose of this research was to examine the applicability of the MetaMetaDB to the assessment of the habitability of microbial communities in aquatic environments. We assumed that this database would offer new information on the origin or evolutionary processes of bacterial groups that has been overlooked.
2 Materials and methods
2.1 Sampling sites and sample collection
Table 1 Summary of different environmental parameters of the sampling stations
Columns: Air temp. (°C); Water temp. (°C); Total cell count (× 10⁵ cells mL⁻¹)
Low-salinity stations (salinity 0.5–5.0): 36°35′59″N & 140°55′67″E; 36°34′40″N & 140°57′63″E; 36°33′56″N & 140°59′06″E
High-salinity stations (salinity 32–35): Oarai sea side, 36°31′74″N & 140°59′20″E; Oarai port side, 36°30′99″N & 140°58′46″E; 36°20′49″N & 143°58′38″E; 36°20′53″N & 143°58′29″E
2.2 Filtration and sample preparation
For the river, estuary and coastal samples, 10 L of water were collected in a sterilized screw-capped plastic bag and carried back to the laboratory in an ice box within 2–3 h after sampling. Two liters of each water sample were filtered through a 0.22-μm pore-sized Sterivex-GP pressure filter unit (Merck, Billerica, MA, USA) for the total (T) bacterial fraction. Another 2 L of each water sample were pre-filtered through a 47-mm, 3.0-μm pore-sized Nuclepore polycarbonate track-etched membrane filter (GE Healthcare Life Science, Chiyoda-ku, Tokyo, Japan) for the particle-associated (PA) fraction and then through a 0.22-μm pore-sized Sterivex filter unit for the free-living (FL) bacterial fraction. The samples from the R/V Hakuho Maru were collected using a Niskin bottle and filtered onboard within 1–2 h after collection. These seawater samples were treated as described above. The membrane filters and the Sterivex cartridge filters were stored at −20 or −80 °C until further processing.
2.3 Environmental parameters and total cell count
The air and water temperatures of the river, estuary and coastal samples were measured with a mercury thermometer. Their salinity was determined with a refractometer (IS/Mill-E, As One, ATAGO, Japan). For cruise samples, salinity, water temperature and depth were obtained using an SBE 911 plus CTD system (Sea-Bird Electronics, Inc., Washington DC, USA). To enumerate the bacterial abundance, the collected water samples were fixed with formalin (2% final concentration) and stored at 4 °C in the dark until enumeration. A 1-mL sample was filtered onto a 25-mm2, 0.2-µm pore-sized Isopore™ membrane filter (Merck Millipore Ltd., Tullagreen, Carrigtwohill, Co. Cork, Ireland), stained with DAPI (4′,6-diamidino-2-phenylindole) mix solution [5.5 parts Citifluor (Citifluor), 1 part Vectashield (Vector Laboratories) and 0.5 parts phosphate-buffered saline (PBS), with DAPI (final concentration 2 μg ml−1)] and examined under an Olympus BX-51 epifluorescence microscope (Olympus Opticals, Tokyo, Japan; Porter and Feig 1980). Approximately 40–60 images were taken from each filter, and the total prokaryotic cells in 10 randomly selected images were counted. The total count was calculated as the average of triplicate samples.
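The counting procedure above (cells counted in 10 image fields per filter, then averaged over triplicate samples) can be sketched as a short calculation. This is only an illustration under stated assumptions: the text does not give the effective filtration area or the area of one image field, so `filter_area_mm2` and `image_area_mm2` are hypothetical parameters, and the function names are ours.

```python
from statistics import mean

def cells_per_ml(counts_per_image, filter_area_mm2, image_area_mm2, volume_ml=1.0):
    """Scale the mean count per image field to the whole filter and to 1 mL.

    filter_area_mm2 and image_area_mm2 are hypothetical inputs (not given in
    the text); volume_ml defaults to the 1 mL filtered per sample.
    """
    fields_per_filter = filter_area_mm2 / image_area_mm2
    return mean(counts_per_image) * fields_per_filter / volume_ml

def total_count(replicates, filter_area_mm2, image_area_mm2, volume_ml=1.0):
    """Average the per-sample estimates over replicate samples (triplicates in the text)."""
    return mean(
        cells_per_ml(counts, filter_area_mm2, image_area_mm2, volume_ml)
        for counts in replicates
    )
```

With counts from 10 images for each of three replicate filters, `total_count` returns a single cells-per-mL estimate that can be reported in units of 10⁵ cells mL⁻¹.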
2.4 DNA extraction
DNA was extracted using ChargeSwitch Forensic DNA Purification Kits (Invitrogen™, Carlsbad, CA, USA) with slight modifications, using ZircoPrep Mini (FastGene™, Nippon Genetics Co. Ltd., Bunkyo-ku, Tokyo, Japan) for bead beating prior to the lysis process. Bead beating was done using a Micro Smash (MS-100R, Tomy Medico., Ltd., Tokyo, Japan) at 5000 rpm and 4 °C for 30 s for each filter, with great care to avoid contamination. The extracted DNA was then cleaned with a NucleoSpin gDNA Clean-up kit (MACHEREY–NAGEL GmbH & Co. KG, Neumann-Neander-Str., Düren, Germany) according to the manufacturer's protocol and stored at −30 °C until further processing.
2.5 Bacterial 16S rRNA gene amplification and pyrosequencing
The hypervariable V1–V3 regions of the bacterial 16S rRNA gene were amplified by polymerase chain reaction (PCR) using the forward primer 27F with multiplex identifiers (MIDs): 5′-CCATCTCATCCCTGCGTGTCTCCGACTCAGXXXXXXXXXXAGAGTTTGATCMTGGCTCAG-3′ and the reverse primer 519R with adaptor: 5′-CCTATCCCCTGTGTGCCTTGGCAGTCTCAG(GWATTACCGCGGCKGCTG)-3′, where the Xs represent the sample-specific MID (Kim et al. 2011). PCR reactions were carried out in triplicate in 20 μL of a mixture consisting of 2 μL of DNA template, 13.1 μL of molecular biological grade double-distilled water, 0.6 μL of each primer at 5 μM, 2 μL of 10 × TaKaRa Ex Taq Buffer, 1.6 μL of TaKaRa dNTP mixture (2.5 mM each), and 0.1 μL of XUnits TaKaRa Ex Taq HS Polymerase (TaKaRa, Kusatsu, Shiga, Japan). Thermal cycling was carried out under the following conditions: initial denaturation at 94 °C for 4 min; 25 cycles of denaturation at 98 °C for 10 s, annealing at 55 °C for 30 s and elongation at 72 °C for 1 min; and final elongation at 72 °C for 10 min. After amplification, the presence of the desired length of the partial 16S rRNA gene was confirmed by agarose gel electrophoresis, and possible contamination was checked by inspecting the bands of the triplicates of the same samples. After confirming the length, PCR products were purified and normalized using Agencourt AMPure XP (Beckman Coulter Inc., Beverly, MA, USA) according to the guidance of the 454 Sequencing Amplicon Library Preparation Method Manual (GS Junior Titanium Series 2012). The purified PCR products were then quantified using the Quant-iT™ PicoGreen® dsDNA Assay Kit (Thermo Fisher Scientific, Eugene, OR, USA).
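As a quick arithmetic check on the reaction mixture and primer design described above, the following minimal sketch sums the listed component volumes, derives the final primer and dNTP concentrations by simple dilution, and assembles the forward fusion primer from its three parts. The sequences and volumes are taken from the text; the helper names and the example MID `ACGAGTGCGT` are placeholders chosen for illustration, not necessarily identifiers used in the study.

```python
# Component volumes (µL) as listed in the text for one 20-µL reaction.
components_ul = {
    "DNA template": 2.0,
    "double-distilled water": 13.1,
    "forward primer (5 µM)": 0.6,
    "reverse primer (5 µM)": 0.6,
    "10x Ex Taq buffer": 2.0,
    "dNTP mixture (2.5 mM each)": 1.6,
    "Ex Taq HS polymerase": 0.1,
}
# Rounded to 0.1 µL to avoid floating-point noise in the sum.
total_ul = round(sum(components_ul.values()), 1)

def final_concentration(stock, added_ul, total):
    """Simple dilution: C_final = C_stock * V_added / V_total."""
    return stock * added_ul / total

primer_final_um = final_concentration(5.0, 0.6, total_ul)  # each primer, µM
dntp_final_mm = final_concentration(2.5, 1.6, total_ul)    # each dNTP, mM

# Forward fusion primer = adaptor (with key) + 10-nt MID + 27F, as in the text.
ADAPTOR_FWD = "CCATCTCATCCCTGCGTGTCTCCGACTCAG"
PRIMER_27F = "AGAGTTTGATCMTGGCTCAG"  # M is the IUPAC code for A or C

def fusion_forward_primer(mid):
    """Insert a sample-specific 10-nt MID between the adaptor and 27F."""
    if len(mid) != 10 or set(mid) - set("ACGT"):
        raise ValueError("MID must be 10 nucleotides of A/C/G/T")
    return ADAPTOR_FWD + mid + PRIMER_27F

example_primer = fusion_forward_primer("ACGAGTGCGT")  # placeholder MID
```

The listed volumes add up to the stated 20 μL, which corresponds to final concentrations of about 0.15 μM for each primer and 0.2 mM for each dNTP.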
After quantification, the PCR products were sequenced using a 454 GS Junior sequencer (Roche Diagnostics, 454 Life Sciences Corp., Branford, CT, USA) at the Atmosphere and Ocean Research Institute, the University of Tokyo (Kashiwa, Chiba, Japan), according to the manufacturer’s protocol for the 454 GS Junior Titanium Series.
2.5.1 Sequence data accession number
The raw sequence data were deposited in the DDBJ Sequence Read Archive databases under the accession number DRA004565.
2.6 Sequence analyses
The subsequent analysis, quality checking and arrangement were done using the open-source MOTHUR program (Schloss et al. 2009) following the guidelines of the 454 standard operating procedure (http://www.mothur.org/wiki/454_SOP). The data files obtained from separate runs were merged after removing the tags and primer sequences and after trimming (qwindowaverage = 35, qwindowsize = 50, minlength = 200). In the next step, the data set was reduced to the unique sequences only to make it more workable. The sequences were then aligned using the “silva.nr_v119.align” file as reference (silva.nr_v119; Pruesse et al. 2007). Sequencing errors were further reduced by screening, filtering and de-noising through the pre-cluster method (Huse et al. 2010). Chimeras were detected and removed using chimera.uchime (Edgar et al. 2011). To improve the data quality, sequences were subsequently classified using the Ribosomal Database Project (Maidak et al. 1996) reference files, and sequences affiliated with chloroplasts, mitochondria and other organelles of “former” bacterial origin were removed from our dataset using the “remove.lineage” command. The resulting high-quality sequences were used to generate the distance matrix and to assign clusters to operational taxonomic units (OTUs) at a 97% identity level (Schloss and Westcott 2011). A representative sequence from every OTU was classified with the MOTHUR program based on the “silva.nr_v119.tax” file (silva.nr_v119). To standardize the number of reads between samples, the reads were randomly re-sampled to the size of the sample with the fewest reads (2838 reads) using the MOTHUR program; this was done based on OTU files clustered at a 0.03 cut-off level.
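The final standardization step above (random re-sampling of every sample down to the read count of the smallest sample) can be illustrated with a minimal sketch. This is not the MOTHUR implementation; it only shows the idea of drawing reads without replacement from per-sample OTU counts, and the function names and the fixed random seed are our own assumptions.

```python
import random
from collections import Counter

def rarefy(otu_counts, depth, seed=0):
    """Draw `depth` reads without replacement from one sample's OTU counts."""
    # Expand the counts into one entry per read, labelled by its OTU.
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    return Counter(rng.sample(pool, depth))

def rarefy_all(samples, seed=0):
    """Subsample every sample to the size of the smallest one."""
    depth = min(sum(counts.values()) for counts in samples.values())
    return {name: rarefy(counts, depth, seed) for name, counts in samples.items()}
```

In the study the common depth was 2838 reads; in the sketch it is derived from whatever counts are passed in, so every returned sample has the same total number of reads.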
2.7 Tracking the habitability of the identified representative sequences
To test the habitability, the obtained 16S rRNA gene sequences were tracked against the MetaMetaDB, and information on their habitability was gathered above 97% (species level), 95% (genus level), 90% (family level), 85% (order level) and 80% (class level) of identity, which are the default outputs of the database and follow Kirchman (2012). To check the robustness of the results and possible biases, the same sequences were searched against both the latest version (data up to November 6, 2014) and the previous version (data up to March 19, 2014). The latest version of the MetaMetaDB contains 2,949,852 representative 16S rRNA sequences from 61 diverse environments, while the previous one contained 2,737,833 representative 16S rRNA sequences (http://mmdb.aori.u-tokyo.ac.jp/download.html; Yang and Iwasaki 2014).
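The identity cut-offs above can be expressed as a small lookup, together with a tally of habitat categories among the hits that pass a chosen cut-off. This is a minimal sketch under stated assumptions: the `(habitat, identity)` hit list is a hypothetical input format rather than the actual output of the MetaMetaDB, the comparison is taken as "greater than or equal to" the cut-off, and the plain fraction of hits per habitat is our simplification, not the scoring used by the database.

```python
from collections import Counter

# Identity cut-offs (%) and the taxonomic levels they approximate, as in the text.
THRESHOLDS = [
    (97.0, "species"),
    (95.0, "genus"),
    (90.0, "family"),
    (85.0, "order"),
    (80.0, "class"),
]

def finest_level(identity_pct):
    """Return the finest taxonomic level whose cut-off the identity meets, or None."""
    for cutoff, level in THRESHOLDS:
        if identity_pct >= cutoff:
            return level
    return None

def habitat_profile(hits, cutoff):
    """Fraction of hits per habitat among hits with identity >= cutoff.

    `hits` is a hypothetical list of (habitat, identity_pct) pairs.
    """
    kept = [habitat for habitat, identity in hits if identity >= cutoff]
    if not kept:
        return {}
    return {habitat: n / len(kept) for habitat, n in Counter(kept).items()}
```

For example, a hit at 96.1% identity counts at the genus level and at all coarser levels, but not at the 97% (species) level, which is why fewer phylotypes are assigned a habitat as the cut-off is raised.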
3.1 Environmental parameters and total cell count
The sampling locations, depth, air temperature, water temperature, salinity and total cell count are shown in Table 1. Salinity showed a typical gradient from freshwater (about 0.5 at the river stations) through brackish water (about 5.0 at the estuary station to about 32 at the Oarai sea side station) to offshore marine water in the Kuroshio (about 35). The highest total cell count was at the Oarai port side station and the lowest in Kuroshio surface waters. The counts for the two riverine stations, the estuarine station and the sea side station were similar (Table 1).
3.2 Bacterial community structures
3.3 Habitability assessment
Percentage of phylotypes assigned to habitability at the different levels of identity with their contribution to the relative abundance
Columns: Level of identity; Number of phylotypes assigned out of 11,178; % of phylotypes assigned; % contribution to the relative abundance by the assigned phylotypes (at low-salinity stations; at high-salinity stations)
Habitability of the phylotypes at 85% (order) level of identity
Columns: Mostly adapted habitats (MHIs); Contributions at low-salinity stations (%); Contributions at high-salinity stations (%)
Rows: Freshwater and groundwater (14–100%); Oil production facilities (22–100%); Plants and roots (17–100%); Sediments and soil (18–100%); Others (ant, bioreactors, compost, epibiont, food, gut, ice, pig; 23–100%)
3.4 Major groups assigned to each habitat identity
At an 85% level of identity, more than 20% of the sequences in the FL and PA fractions from Kuroshio surface water were unexpectedly assigned to “human”, more specifically, “human gut” (Supplementary Table 6). Those sequences mostly belonged to the order “Cyanobacteria subsection I” of the phylum Cyanobacteria. The same group was also found to contribute about 4% of the total in the FL fraction of Kuroshio chlorophyll maximum water. At the other sampling points, this group was absent or scarce (Supplementary Table 6).
The study was conducted to examine the applicability of the MetaMetaDB to the assessment of the habitability of bacterial communities in river, estuarine, coastal and offshore environments. To confirm the applicability, we assumed that the following conditions should be met. First, for samples obtained at different salinities, the MetaMetaDB gives habitat assignments that are reasonably explained by the salinity differences. Second, the MetaMetaDB shows terrestrial or anthropogenic influences on some coastal samples, but not on offshore samples. Third, for some phylogenetic groups, the MetaMetaDB gives information that is overlooked by ordinary community structure analyses. Regarding the first condition, at a 97% level of identity, habitability showed clear and expected differences between the low- and high-salinity stations. The bacterial communities at the two river stations and the one estuarine station were mostly occupied by phylotypes from “freshwater and groundwater” and “human”, while the high-salinity coastal stations were mostly occupied by phylotypes assigned as “marine” and “plants and roots”. A similar tendency was also found at the 85% (order) level of identity. Regarding the second, at a 97% level, sequences assigned to “freshwater and groundwater” or “human” were present, especially at the Oarai sea side station, but were absent in the Kuroshio area. At an 85% level, sequences assigned to “sediments and soil” were seen at the Oarai stations, whereas they made up only a minor portion in the Kuroshio area. It is noteworthy that the Oarai sea side sometimes receives freshwater from the Naka River (Matsu-ura et al. 2010; Uda 2010), which might bring some phylotypes assigned to “freshwater and groundwater”. Regarding the third, we found two examples: Bacteroidetes, and sequences assigned to “human” in Kuroshio water (see below).
Among the Bacteroidetes phylotypes assigned to “freshwater and groundwater” and “human” at a 97% level, the genus Flavobacterium (Flavobacteriales) contributed predominantly. The members of the genus Flavobacterium (Bergey et al. 1923) are widely distributed in soil and freshwater habitats, and some of them are pathogenic for fish (Bernardet et al. 1996; Kim et al. 2009; Wang et al. 2006). The phylotypes assigned to “marine” contained the NS5 marine group (Flavobacteriales), the NS9 marine group (Flavobacteriales), the genus Polaribacter (Flavobacteriales) and the genus Ulvibacter (Flavobacteriales). The genus Polaribacter was first isolated from a polar marine environment (Gosink et al. 1998) and has since been found in various regions, including coastal areas of Japan (Fukui et al. 2013; Teeling et al. 2012). The genus Ulvibacter was first isolated from a green alga (Nedashkovskaya et al. 2004) and later from coastal sea water (Baek et al. 2014). The phylotypes assigned as “plants and roots” were mostly occupied by the NS4 and the genus Marinoscillum (Cytophagales) of the Bacteroidetes phylum, which were only present in the Kuroshio waters. The genus Marinoscillum was first isolated from a marine sponge (Seo et al. 2009). Thus, the MetaMetaDB showed that there are multiple members with different habitat identities in the Bacteroidetes phylum, which was overlooked by ordinary community structure analyses.
The second unexpected result was the presence of sequences with a habitat identity of “human” in Kuroshio surface water. Di Rienzi et al. (2013) attempted whole-genome reconstruction from human feces and proposed a new candidate phylum, Melainabacteria, a sibling to Cyanobacteria. The Melainabacteria are non-photosynthetic, anaerobic and obligately fermentative, and are also present in soil and aquatic environments. Although further detailed analyses are required, there is a possibility that members belonging to this group were present in the Kuroshio samples.
Previous studies reported that the phylum Bacteroidetes is widely distributed in both freshwater and marine habitats (Amaral-Zettler et al. 2010; Glockner et al. 1999; Kirchman 2002; Kirchman 2012). The classes Alphaproteobacteria (DeLong et al. 2006; Fuhrman and Davis 1997; Lopez-Garcia et al. 2001; Pham et al. 2008) and Gammaproteobacteria are generally the two most dominant groups in high-salinity or marine environments, but are relatively rare in freshwater or low-salinity waters. In contrast, the class Betaproteobacteria and the phylum Actinobacteria are more abundant in freshwater habitats, but less so in marine habitats (Cottrell and Kirchman 2003; Crump et al. 1999; Herlemann et al. 2011; Kan et al. 2008; Kirchman et al. 2005; Kirchman 2012; Murray et al. 1996). We also found that the Kuroshio water was dominated by the phylum Cyanobacteria in both PA and FL fractions, and by the phylum Verrucomicrobia, especially in PA fractions of the chlorophyll maximum layer. The abundance of the phylum Cyanobacteria in the Kuroshio area was reported in previous studies (Juan et al. 2011; Kataoka et al. 2009). There were also indications that the phylum Verrucomicrobia is significantly more abundant in PA fractions than in FL fractions in some areas (Crespo et al. 2013; Freitas et al. 2012) and more abundant in the chlorophyll maximum layer (Crespo et al. 2013). Thus, the present results on bacterial community structures are consistent with the typical distribution patterns of the phyla Bacteroidetes, Proteobacteria, Actinobacteria, Cyanobacteria and Verrucomicrobia.
The results did not show clear differences in either community structure or habitability between the PA and FL fractions of bacteria, with the only exception being the Kuroshio chlorophyll maximum layer, suggesting that the exchange of populations between the FL and PA states occurs at a reasonably fast rate. However, this does not exclude the possibility that there may be some differences at a finer spatiotemporal scale.
In conclusion, to examine the applicability of the MetaMetaDB to the assessment of habitability, the bacterial community structures in environments with different salinities were examined. The MetaMetaDB showed that the low-salinity stations were dominated by sequences with “freshwater and groundwater”, “human” and “wastewater” habitat identities, while the high-salinity stations were dominated by those with a “marine” habitat identity. These overall findings show that the MetaMetaDB is useful as a new tool to infer microbial habitability.
- Bergey DH, Harrison FC, Breed RS, Hammer BW, Huntoon FM (eds) (1923) Bergey’s manual of determinative bacteriology. Williams and Wilkins, Baltimore
- Bernardet JF, Segers P, Vancanneyt M, Berthe F, Kersters K, Vandamme P (1996) Cutting a Gordian knot: emended classification and description of the genus Flavobacterium, emended description of the family Flavobacteriaceae, and proposal of Flavobacterium hydatis nom. nov. (basonym, Cytophaga aquatilis Strohl and Tait 1978). Int J Syst Bacteriol 46:128–148
- Crump BC, Armbrust EV, Baross JA (1999) Phylogenetic analysis of particle-attached and free-living bacterial communities in the Columbia River, its estuary, and the adjacent coastal ocean. Appl Environ Microbiol 65:3192–3204
- Giovannoni SJ, Mullins TD, Field KG (1995) Microbial diversity in oceanic systems: rRNA approaches to the study of unculturable microbes. NATO ASI Ser Ser G Ecol Sci 38:217–248
- Glockner FO, Fuchs BM, Amann R (1999) Bacterioplankton compositions of lakes and oceans: a first comparison based on fluorescence in situ hybridization. Appl Environ Microbiol 65:3721–3726
- Gosink JJ, Woese CR, Staley JT (1998) Polaribacter gen. nov., with three new species, P. irgensii sp. nov., P. franzmannii sp. nov. and P. filamentus sp. nov., gas vacuolate polar marine bacteria of the Cytophaga–Flavobacterium–Bacteroides group and reclassification of ‘Flectobacillus glomeratus’ as Polaribacter glomeratus comb. nov. Int J Syst Bacteriol 48:223–235
- Hiorns WD, Methe BA, Nierzwicki-Bauer SA, Zehr JP (1997) Bacterial diversity in Adirondack Mountain lakes as revealed by 16S rRNA gene sequences. Appl Environ Microbiol 63:2957–2960
- Juan L, Zhang Y, Junde D, Youshao W, Lei C, Jingbin F, Hongyan S, Dongxiao W, Si Z (2011) Spatial variation of bacterial community composition near the Luzon strait assessed by polymerase chain reaction-denaturing gradient gel electrophoresis (PCR-DGGE) and multivariate analyses. Afr J Biotechnol 10(74):16897–16908
- Kirchman DL (2002) The ecology of Cytophaga-Flavobacteria in aquatic environments. FEMS Microbiol Ecol 39:91–100
- Kirchman DL (2012) Processes in microbial ecology. Oxford University Press, New York
- Matsu-ura T, Uda T, Kumada T, Sumiya M (2010) Sand accumulation in wave-shelter zone of Oharai Port and change in grain size of seabed materials on nearby coast. In: Proc 32nd ICCE, sediment 63:1–11. http://journals.tdl.org/ICCE/article/view/1077/pdf_179
- Murray AE, Hollibaugh JT, Orrego C (1996) Phylogenetic compositions of bacterioplankton from two California estuaries compared by denaturing gradient gel electrophoresis of 16S rDNA fragments. Appl Environ Microbiol 62:2676–2680
- Seo HS, Kwon KK, Yang SH, Lee HS, Bae SS, Lee JH, Kim SJ (2009) Marinoscillum gen. nov., a member of the family ‘Flexibacteraceae’, with Marinoscillum pacificum sp. nov. from a marine sponge and Marinoscillum furvescens nom. rev., comb. nov. Int J Syst Evol Microbiol 59:1204–1208
- Uda T (2010) Impacts on sandy beach and habitat of Japanese hard clams due to construction of port breakwater. In: Inter Symposium on Integrated Coastal Management for Marine Biodiversity in Asia. Collective abstracts B-1, pp 28–34