Introduction

Chemoautotrophic bacteria form the base of the food chain in deep-sea hydrothermal vent ecosystems [1, 2]. Many of these chemoautotrophs live in highly integrated symbiotic associations with invertebrate hosts, such as mussels, clams and tube worms, which enables megafaunal communities to thrive in the otherwise uninhabitable vent ecosystem [3,4,5].

The mussel Bathymodiolus thermophilus, for example, a bivalve belonging to the family Mytilidae, densely populates the hydrothermal vent fields of the Galapagos Rift and of the East Pacific Rise between the latitudes 13 °N and 21 °S [6]. Although the animal’s food groove and digestive tract are reduced [6], B. thermophilus appears to be able to ingest and assimilate suspended particles by filter feeding [7].

The major part of the bivalve’s nutrition, however, is derived from its chemosynthetic symbionts [4, 8]. These sulfur-oxidizing bacteria live within specialized gill cells, so-called bacteriocytes [9]. Provided with a steady supply of reduced sulfur from the vents, the symbionts synthesize organic compounds and thus feed their host [10, 11].

Investigations into the symbiont’s physiology have so far been limited by the inaccessibility of mussel samples and by the failure to culture the symbionts in vitro. The metabolic pathways underlying the putative exchange of nutrients between the symbiotic partners therefore remain unexplored. However, culture-independent methods, such as direct genomic, transcriptomic or proteomic analyses of symbiont-containing tissue or of enriched symbiont fractions, have provided useful physiological information about various uncultured marine symbionts in the past [12,13,14,15,16,17]. In this study, we used symbiont-enriched preparations from B. thermophilus gill tissue to assemble the first draft genome of the B. thermophilus symbiont in order to gain preliminary insights into its metabolic potential.

Organism information

Classification and features

B. thermophilus symbiont cells are coccoid or rod-shaped (Fig. 1). In electron micrographs, they typically appear as roundish forms whose central region is light or transparent (looking “empty”), while the outermost regions of the cytoplasm are darker and more structured (Fig. 1 and [9]). Like most sulfur-oxidizing (thiotrophic) bivalve symbionts [4, 18], the B. thermophilus symbiont has a Gram-negative cell wall. With a diameter of 0.3–0.5 μm, B. thermophilus symbiont cells are similar in size to thiotrophic symbionts from other Bathymodiolus host species [19,20,21,22], and notably smaller than sulfur-oxidizing symbionts from other invertebrate hosts [4, 23]. In the host tissue, the symbionts are usually enveloped in large vacuoles. Groups of up to 20 symbionts within a single host vacuole have previously been reported by Fisher and colleagues [9]. Imaging of purified symbiont fractions from homogenized B. thermophilus gill tissue revealed, besides a large number of free symbiont cells, some intact vacuoles encompassing multiple symbionts (Fig. 1).

Fig. 1

Transmission electron micrographs of Candidatus Thioglobus thermophilus. B. thermophilus gill tissue was homogenized in a glass tissue grinder and subjected to crude density gradient centrifugation using Histodenz™ gradient medium. Subsamples were taken from two visible bands and fixed for electron microscopy (a and b). Both subsamples contained numerous free symbiont cells (S) as well as some intact host vacuoles (V) containing several symbiont cells, besides various other cellular components and host tissue debris. L: Lipid drop or mucus. Scale bar: 5 μm. Electron microscopy method details: samples were fixed in a) 1% glutaraldehyde, 2% paraformaldehyde in IBS (imidazole-buffered saline; 0.49 M NaCl, 30 mM MgSO4*7H2O, 11 mM CaCl2*2H2O, 3 mM KCl, 50 mM imidazole) and b) in 2.5% glutaraldehyde, 1.25% paraformaldehyde in IBS. After embedding in low-gelling agarose and postfixation in 1% osmium tetroxide in cacodylate buffer (0.1 M cacodylate; pH 7.0), samples were dehydrated in a graded ethanol series (30 to 100%) and embedded in a mixture of Epon and Spurr (1:2). Sections were cut on an ultramicrotome (Reichert Ultracut, Leica UK Ltd., Milton Keynes, UK), stained with 4% aqueous uranyl acetate for 5 min followed by lead citrate for 1 min and analyzed with a transmission electron microscope LEO 906 (Zeiss, Oberkochen, Germany)

B. thermophilus symbionts reside intracellularly in bacteriocytes in their host’s gill tissue. Unlike some other Bathymodiolus species, such as B. azoricus, which maintains a dual symbiosis with both sulfur-oxidizing and methane-oxidizing bacteria [24], B. thermophilus hosts only one type of bacterial endosymbiont. Based on 16S rRNA gene similarity [25], this sulfur-oxidizing symbiont population in B. thermophilus belongs to a single phylotype.

The B. thermophilus symbiont is a member of the Gammaproteobacteria (NCBI taxonomy ID 2360). It is closely related to symbionts of other Bathymodiolus species, and more distantly related to symbionts of other invertebrate hosts and to free-living Gammaproteobacteria from various marine habitats [26]. As shown in Fig. 2, the B. thermophilus symbiont falls in a well-supported clade consisting of symbionts of other mytilid and vesicomyid bivalves and free-living gammaproteobacterial clones from marine vents and other submarine volcanic sites. Its closest relatives are the ‘Bathymodiolus aff. thermophilus thioautotrophic gill symbiont’ from 32 °N EPR (NCBI taxonomy ID 363574; 99.85% similarity on the 16S rRNA level) and the Bathymodiolus brooksi symbiont from the Gulf of Mexico (NCBI taxonomy ID 377144; 99.53% similarity). According to our analysis, the B. thermophilus symbiont is only remotely similar to the sulfur-oxidizing symbionts of deep-sea vestimentiferan tube worms (90% 16S rRNA similarity) and of shallow-water lucinid clams (87–90% similarity, see Fig. 2).
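The similarity values quoted above are pairwise identities between aligned 16S rRNA gene sequences. As a minimal illustration of how such a value can be computed, the following Python sketch counts matching positions over the alignment columns in which neither sequence has a gap. The function name and the gap-handling convention are assumptions made for this example and do not describe the software settings used in this study; the sequences in the usage note are invented toy data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two aligned sequences.

    Columns in which either sequence carries a gap ('-') are skipped.
    This is one common convention (an assumption here); other tools
    may count gaps as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap column: not counted
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared
```

For the toy alignment `percent_identity("ACGT-A", "ACGAAA")`, five columns are compared and four match, giving 80.0.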

Fig. 2

Phylogenetic tree of Candidatus Thioglobus thermophilus and related free-living and host-associated sulfur oxidizers. Ca. T. thermophilus, the thiotrophic symbiont of Bathymodiolus thermophilus, is displayed in bold. The tree was inferred from closely related 16S rRNA gene sequences obtained from the SILVA database using the SILVA Incremental Aligner (SINA) [51] and was estimated with the 16S rRNA sequences of 46 bacteria. The final alignment covered 1138 nucleotides. Sequence alignment and phylogenetic analysis were performed using the MEGA7 software tool [52]. The phylogenetic tree was constructed using the Maximum Likelihood method based on the Tamura-Nei model implemented in MEGA7 [53]. Branch bootstrap support values were calculated using 1000 replicates and are displayed as circles (black: ≥ 90%, white: ≥ 60%). For the sake of clarity some organisms were merged into groups (wedges): (a) uncultured clones (KC682721, KC682765, JQ678401, AB193934); (b) whale fall symbionts (HE814589, HE814588, HE814591, HE814585); (c) uncultured clones (FM246509, FM246513); (d) uncultured clones (JQ678344, JQ678392); (e) Mytilidae symbionts (AM503921, AM503923); (f) Vesicomyidae symbionts (EU403432, EU403431, CP000488* 1081274–1082807, AP009247* 948400–949934); (g) Lucinidae symbionts (X84979, M99448, M90415); (h) tube worm symbionts (NZ_AFOC01000137* 503–2033, DQ660821, NZ_AFZB01000059* 4132–5662). The lucinid clam symbionts, the vestimentiferan tube worm symbionts, and the free-living Thiomicrospira crunogena XCL-2 were included as outgroup. Branches that are not highlighted by colors represent free-living relatives. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. *These NCBI accession numbers refer to whole genome submissions and not to individually submitted 16S rRNA gene sequences (start and stop positions of the 16S rRNA gene are given after the asterisk).
JdFR: Juan de Fuca Ridge, EPR: East Pacific Rise, MAR: Mid-Atlantic Ridge, OMZ: oxygen minimum zone, MFZ: Mendocino Fracture Zone, SBB: Santa Barbara Basin, WH: Woods Hole

The B. thermophilus symbiont’s closest cultured relative is the free-living Candidatus Thioglobus autotrophica, whose genome was recently sequenced [27]. As predicted from their genomes, the metabolic properties of both bacteria appear to be highly similar. Like the B. thermophilus symbiont genome presented in this study, the Ca. T. autotrophica genome encodes an incomplete TCA cycle. The high degree of 16S rRNA gene sequence similarity between Ca. T. autotrophica and the B. thermophilus symbiont (95%) suggests that both belong to the same genus. We therefore propose the tentative name Candidatus Thioglobus thermophilus for the thiotrophic B. thermophilus symbiont.

A summary of key features of Ca. T. thermophilus is given in Table 1.

Table 1 Classification and general features of Candidatus Thioglobus thermophilus, the Bathymodiolus thermophilus gill endosymbiont, according to the MIGS recommendations [42]

Genome sequencing information

Genome project history

The genome of Candidatus Thioglobus thermophilus was sequenced to gain comprehensive insight into the metabolic potential of the bacterium. This project is part of a larger effort to compare the symbiont genomes from various Bathymodiolus species across different vent habitats in order to understand the possible effects of vent geochemistry in shaping host-symbiont evolution in Bathymodiolus. Sequencing and assembly of the symbiont genome were conducted at the Göttingen Genomics Laboratory (University of Göttingen, Germany) and at the Max Planck Institute for Marine Microbiology (Bremen, Germany), respectively. The sequences have been deposited in GenBank under the accession number MIQH00000000. A summary of the project information is shown in Table 2.

Table 2 Project information

Growth conditions and genomic DNA preparation

Symbionts for genome sequencing were isolated from a single B. thermophilus host individual, which was collected during the R/V Atlantis cruise AT26–10 in January 2014. The mussel was collected from a diffuse-flow vent at the Tica vent field on the East Pacific Rise at 9° 50.39′ N, 104° 17.49′ W by the remotely operated vehicle (ROV) Jason. After recovery, the animal was dissected on board the research vessel and gill tissue was removed and homogenized in 1× PBS buffer (Dulbecco’s Phosphate Buffered Saline, Sigma-Aldrich order no. D5773). The resulting homogenate was diluted with 1× PBS (ratio 1:3) and subjected to multiple centrifugation steps (differential pelleting). In the first centrifugation step (500 × g, 5 min, 4 °C in a tabletop centrifuge using a swing-out rotor), crude host tissue debris and host cell nuclei were removed from the homogenate. The supernatant was centrifuged again (step 2) as described above to pellet residual host nuclei. The resulting supernatant was then centrifuged at maximum speed (step 3), i.e. at 15,000 × g for 20 min at 4 °C using a fixed-angle rotor. The resulting pellet contained enriched bacterial cells and was immediately frozen and stored at −80 °C until genomic DNA preparation.

Genomic DNA was isolated from the purified bacteria using the MasterPure DNA Purification Kit (Epicentre) as recommended by the manufacturer.

Genome sequencing and assembly

Sequencing of the B. thermophilus symbiont genome was performed at the Göttingen Genomics Laboratory using the Illumina Genome Analyzer IIx. A Nextera shotgun library was generated for a 112 bp paired-end sequencing run. Sequencing resulted in 7,569,934 paired-end reads. Adaptors were removed from the reads, which were then quality-trimmed (Q = 2) with BBDuk and error-corrected with BBNorm (V35, sourceforge.net/projects/bbmap). The resulting reads were assembled with IDBA-UD [28]. To bin the symbiont genome from the metagenome assembly, we used gbtools [29] based on GC content and sequencing coverage. The corrected reads were mapped against the symbiont genome bin with BBMap and reassembled with SPAdes v. 3.1.1 [30]. This assembly resulted in 1341 contigs longer than 200 bp (1281 scaffolds). The completeness and contamination of the genome were estimated with CheckM [31], which indicated 96.98% completeness with 11.32% contamination and 81.40% strain heterogeneity.
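The principle behind binning by GC content and sequencing coverage can be illustrated with a short Python sketch. It is not a reimplementation of gbtools, which is an interactive R package; it only shows the underlying idea that contigs from one genome tend to cluster in a window of GC fraction and read depth. The `Contig` class, the function names and all threshold values below are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    name: str
    sequence: str
    coverage: float  # mean read depth, assumed to be computed beforehand


def gc_fraction(sequence: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def select_bin(contigs, gc_range, coverage_range):
    """Keep contigs whose GC fraction and coverage fall inside both windows."""
    gc_lo, gc_hi = gc_range
    cov_lo, cov_hi = coverage_range
    return [
        c for c in contigs
        if gc_lo <= gc_fraction(c.sequence) <= gc_hi
        and cov_lo <= c.coverage <= cov_hi
    ]
```

In practice the windows would be chosen by inspecting a GC-versus-coverage plot of the whole metagenome assembly rather than fixed in advance.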

Genome annotation

All scaffolds were annotated using NCBI’s prokaryotic genome annotation pipeline (https://www.ncbi.nlm.nih.gov/genome/annotation_prok/), which uses the gene caller GeneMarkS+ together with a similarity-based gene detection approach [32, 33]. Predicted proteins were assigned Clusters of Orthologous Groups (COG) numbers and Protein Families (Pfam) domains by querying their sequences against the COG database and the Pfam database, respectively, at NCBI (ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/little_endian/). Querying was done using the rpsblast application of the BLAST+ 2.4.0 package with E-value cutoffs of 1 × 10⁻⁵ and 1 × 10⁻⁴ for COG and Pfam, respectively. To manually assign COG categories to the COG numbers returned by rpsblast, the COG category database was downloaded from the COG FTP server (ftp://ftp.ncbi.nih.gov/pub/COG/COG2014/data). For prediction of signal peptides the SignalP 4.1 Server [34], PECAS [35] and Phobius [36] were used. Transmembrane helices and CRISPR loci (CRISPR arrays) were predicted with TMHMM Server v. 2.0 [37] and the CRISPRFinder tool [38], respectively.
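The E-value filtering step can be sketched in Python as follows. The sketch assumes that the rpsblast results were written in the standard 12-column BLAST tabular format, in which the E-value is the eleventh column; the output format actually used in this study is not stated, and the function name, the constant names and the best-hit-per-query rule are assumptions made for illustration.

```python
COG_EVALUE_CUTOFF = 1e-5   # cutoff reported for COG queries
PFAM_EVALUE_CUTOFF = 1e-4  # cutoff reported for Pfam queries


def best_hits(tabular_lines, evalue_cutoff):
    """Return {query: (subject, evalue)} for the lowest-E-value hit per query.

    Expects tab-separated lines with the standard BLAST tabular columns:
    qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    Hits with an E-value above the cutoff are discarded.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue = float(fields[10])
        if evalue > evalue_cutoff:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best
```

Calling `best_hits(lines, COG_EVALUE_CUTOFF)` on the COG results and `best_hits(lines, PFAM_EVALUE_CUTOFF)` on the Pfam results would then give one assignment per protein under the two thresholds.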

Genome properties

The properties of this genome are summarized in Table 3. The draft genome of the sulfur-oxidizing B. thermophilus symbiont contained 3,088,407 bp in 1281 scaffolds >200 bp. The average GC content was 37.7%. A total of 3097 genes were predicted, of which 3045 (98.3%) were protein-encoding genes. The remaining 1.5% and 0.2% consisted of RNA genes and pseudogenes, respectively. Of the protein-encoding genes, 54.5% and 65.2% were affiliated to COG- and Pfam-based functions, respectively. For an overview of predicted COG categories see Table 4.
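The percentages reported above can be checked against the gene counts with a few lines of Python. The sketch uses only the numbers given in the text; the variable and function names are chosen for this example, and the mean scaffold length is a derived value that is not stated in the source.

```python
TOTAL_BP = 3_088_407
SCAFFOLDS = 1281
TOTAL_GENES = 3097
PROTEIN_CODING = 3045


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


coding_pct = percent(PROTEIN_CODING, TOTAL_GENES)
# Genes that are not protein-coding (RNA genes plus pseudogenes).
non_coding = TOTAL_GENES - PROTEIN_CODING
non_coding_pct = percent(non_coding, TOTAL_GENES)
# Derived value, not reported in the text: average scaffold length in bp.
mean_scaffold_bp = TOTAL_BP / SCAFFOLDS

print(f"protein-coding: {coding_pct:.1f}%")
print(f"RNA genes + pseudogenes: {non_coding} genes, {non_coding_pct:.1f}%")
print(f"mean scaffold length: {mean_scaffold_bp:.0f} bp")
```

The 52 genes that are not protein-coding correspond to 1.7% of all genes, which agrees with the sum of the reported 1.5% RNA genes and 0.2% pseudogenes. The mean scaffold length of roughly 2.4 kb reflects the fragmented state of the draft assembly.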

Table 3 Genome statistics
Table 4 Number of genes associated with general COG functional categories

Insights from the genome sequence

Sulfur-oxidizing symbionts of Bathymodiolus species are assumed to be horizontally transmitted, i.e., they supposedly enter their bivalve hosts from a free-living bacterial population in the environment, rather than being transferred from one mussel generation to the next [39]. The idea of a putative free-living stage of the symbiont in the hydrothermal vent environment is in accordance with our genome analysis: unlike some insect symbionts, which are obligatorily dependent on their hosts and have a diminished genome [40], the B. thermophilus symbiont genome (3.1 Mb in size; see Genome properties) is not reduced. With the exception of the tricarboxylic acid (TCA) cycle, which lacks three enzymes (see below), all necessary pathways for a host-independent lifestyle appear to be complete in the B. thermophilus symbiont’s genome.

Energy generation

The B. thermophilus symbiont uses reduced sulfur compounds such as sulfide and thiosulfate as its major energy sources [10]. As predicted from the genome sequence, sulfide and thiosulfate are oxidized to sulfate via the rDSR-APS-Sat pathway and the Sox multienzyme complex, respectively. Oxygen and nitrate are used as final electron acceptors. Complete gene sets for these pathways are present in the symbiont genome.

CO2 fixation and carbon metabolism

The B. thermophilus symbiont genome furthermore encodes a modified version of the CO2-fixing Calvin-Benson-Bassham cycle: while the genes for sedoheptulose-1,7-bisphosphatase and fructose-1,6-bisphosphatase are missing, a pyrophosphate-dependent 6-phosphofructokinase is encoded, which potentially replaces both of these functions (as also described for the endosymbionts of Calyptogena magnifica [12], Riftia pachyptila [13] and Olavius algarvensis [16]). The B. thermophilus symbiont’s TCA cycle is incomplete, as the enzyme 2-oxoglutarate dehydrogenase is missing. Moreover, homologs of the enzymes malate dehydrogenase and succinate dehydrogenase are also lacking, similar to what was reported for the thiotrophic B. azoricus symbiont [17].

Nitrogen metabolism

The B. thermophilus symbiont possesses genes for assimilatory nitrate reduction, i.e. for nitrogen uptake from nitrate. Its genome also encodes the Nar complex, a membrane-bound respiratory nitrate reductase, indicating that nitrate can be used as an alternative electron acceptor besides oxygen. Several membrane transporters for the uptake of nitrate, nitrite and ammonia are also encoded.

Immunity and cell surface interactions

Of the 3045 protein-coding genes, 10.74% are predicted to contain Pfam domains related to bacterial cell surface adhesion, such as bacterial Ig-like domain proteins and cadherins, and to putative toxins, such as pore-forming RTX and MARTX cytotoxins. Another 2.17% of the protein-coding genes are associated with immunity against phages (CRISPR-Cas, restriction modification system and the Abi toxin-antitoxin system). This extensive repertoire of genes associated with pathogenicity and phage defense, typical of pathogens and bacteriophages, was also observed in the related thiotrophic B. azoricus symbiont [17, 41]. This particular feature of Bathymodiolus symbionts is surprising since the bacteria a) reside in shielded intracellular niches, b) are beneficial symbionts for their host, and c) are not related to any known pathogen [26, 41]. Moreover, approximately 1.71% of the protein-coding B. thermophilus symbiont genes belong to several classes of pathogenic and digestive peptidases. Membrane transporters of type I and type II secretion systems, which transport toxins and folded exoproteins such as peptidases, are also encoded. Although their exact roles have not yet been determined, we postulate that these pathogenicity-related genes may be involved in protecting the symbionts against pathogens or phages or may even perform symbiosis-specific functions, such as symbiont attachment to the host or defense against the host’s immune system, as suggested previously [41].
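To put these shares into perspective, they can be converted into approximate gene counts. The Python sketch below does this with the percentages and the total of 3045 protein-coding genes given above. The resulting counts are back-calculated estimates, not values reported in the source, and the dictionary keys are labels chosen for this example.

```python
PROTEIN_CODING_GENES = 3045

# Shares of protein-coding genes per functional group, as given in the text.
category_percent = {
    "adhesion and toxin domains": 10.74,
    "phage immunity": 2.17,
    "peptidases": 1.71,
}


def approx_gene_count(percent_of_genes: float, total: int = PROTEIN_CODING_GENES) -> int:
    """Back-calculate an approximate gene count from a rounded percentage."""
    return round(total * percent_of_genes / 100.0)


approx_counts = {name: approx_gene_count(p) for name, p in category_percent.items()}

for name, count in approx_counts.items():
    print(f"{name}: about {count} genes")
```

Under these assumptions, the three groups correspond to roughly 327, 66 and 52 genes, respectively.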

Conclusions

Sequencing of the uncultured B. thermophilus symbiont’s genome allowed preliminary insights into its genomic characteristics and metabolic potential. Candidatus Thioglobus thermophilus appears to rely solely on sulfide and thiosulfate as energy sources, as genes for the oxidation of other reduced compounds were absent from its genome. The absence of three genes encoding essential TCA cycle enzymes, which was recently also reported for the thiotrophic B. azoricus symbiont [17], may suggest that these genes are consistently missing in Bathymodiolus symbionts. The unusual presence of a repertoire of genes associated with cell adhesion, toxin production and phage immunity in the non-pathogenic B. thermophilus symbiont may point to a symbiosis-specific beneficial role of these functions other than pathogen defense.