Genomic insights into temperature-dependent transcriptional responses of Kosmotoga olearia, a deep-biosphere bacterium that can grow from 20 to 79 °C

Temperature is one of the defining parameters of an ecological niche. Most organisms thrive within a temperature range that rarely exceeds ~30 °C, but the deep subsurface bacterium Kosmotoga olearia can grow over a temperature range of 59 °C (20–79 °C). To identify genes correlated with this flexible phenotype, we compared transcriptomes of K. olearia cultures grown at its optimal 65 °C to those at 30, 40, and 77 °C. The temperature treatments affected expression of 573 of 2224 K. olearia genes. Notably, this transcriptional response elicits re-modeling of the cellular membrane and changes in metabolism, with increased expression of genes involved in energy and carbohydrate metabolism at high temperatures and up-regulation of amino acid metabolism at lower temperatures. At sub-optimal temperatures, many transcriptional changes were similar to those observed in mesophilic bacteria at physiologically low temperatures, including up-regulation of typical cold stress genes and ribosomal proteins. Comparative genomic analysis of additional Thermotogae genomes indicates that one of K. olearia’s strategies for low-temperature growth is increased copy number of some typical cold response genes through duplication and/or lateral acquisition. At 77 °C one-third of the up-regulated genes are of hypothetical function, indicating that many features of high-temperature growth are unknown.

Electronic supplementary material: The online version of this article (doi:10.1007/s00792-017-0956-9) contains supplementary material, which is available to authorized users.

Therefore, equal volumes of a phenol-ethanol stop solution were added to the cultures prior to harvesting the cells (as described in Materials and Methods). This resulted in much lower expression of the cold shock-related gene (Kole_2064) at 65°C, high expression of expected metabolic genes (e.g., Kole_0379-0382), and very high expression of an alcohol dehydrogenase (Kole_0742) (Fig. 3; Table S4). The latter is likely due to exposure of the culture to the phenol-ethanol stop solution at the beginning of RNA extraction, highlighting that this cell killing/RNA isolation procedure also provokes a response. In addition, expression of the redox-sensing transcriptional repressor Rex (Kole_1558) was significantly lower in the 30°C cultures, which suggests that the cultures could have been exposed to some oxygen when the stop solution was injected. However, since we saw no increased expression of the peroxiredoxin gene (Kole_1121), we concluded that the effect of oxygen exposure from adding the stop solution is minimal. Thus, since the goal of this study is to elucidate temperature-associated gene expression changes, the cell harvesting method with addition of a phenol-ethanol stop solution was selected for preparation of all subsequent transcriptomes.

Effect of sequencing platform
Unexpectedly, Principal Component Analysis (PCA) revealed that the transcriptomes constructed from the same total RNA sample (K65.2 and K65.IT; K40.2 and K40.ML.IT) did not cluster together (Fig. S3A). Instead, within each temperature these transcriptomes grouped with other samples generated with the same sequencing method (Fig. S3A), suggesting the presence of a platform-specific bias. Given that the Ion Torrent libraries resulted in fewer reads (Table S1), the bias could originate from the lower sequencing depth achieved for the Ion Torrent transcriptomes. In support of this hypothesis, in comparisons of the K65-2/K65-IT and K40-2/K40-IT transcriptome pairs the highest variability in RPKM values was observed mostly for the low-abundance transcripts (data not shown), including transcripts that were completely undetected in the IT transcriptome of the analyzed pair (Table S4). Therefore, to remove the confounding effects of the observed platform-specific bias, all temperature comparisons were performed using only the Illumina-generated transcriptomes, and the Ion Torrent datasets were used only to confirm observed trends.

[...] (Table S1) predicted 441 operons that collectively contain 1,716 genes and 475 single-gene transcripts (Table S2), including 38 predicted transcripts absent from the current genome annotation (NC_012785.1). Among the 441 operons, 65 were identified as single transcripts with confidently defined boundaries, 288 were predicted to contain multiple transcripts, and the remaining 88 had uncertain boundary predictions. Because a large fraction of the identified operons are predicted to produce multiple transcripts, we estimate a minimum of 475 + 441 = 916 transcriptional units (TUs) in K. olearia, 52% of which consist of a single gene (Table S2).
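The transcriptional-unit estimate above is simple arithmetic on the counts reported in the text, and it can be checked with a short Python sketch. All numbers are taken from this section; the variable names are illustrative only and do not come from the operon-prediction software.

```python
# Counts reported in the text (Table S2); variable names are illustrative.
n_operons = 441                    # predicted operons
genes_in_operons = 1716            # genes contained in those operons
n_single_gene_transcripts = 475    # transcripts consisting of one gene

# Operon sub-classes reported in the text should add up to the operon total.
operon_classes = {
    "single transcript, confident boundaries": 65,
    "multiple transcripts": 288,
    "uncertain boundaries": 88,
}
operon_class_total = sum(operon_classes.values())

# Minimum number of transcriptional units: every operon yields at least one
# transcript, and every single-gene transcript is a unit of its own.
min_tus = n_operons + n_single_gene_transcripts

# Fraction of transcriptional units that consist of a single gene.
single_gene_fraction = n_single_gene_transcripts / min_tus

# Genes accounted for by operons plus single-gene transcripts.
genes_accounted_for = genes_in_operons + n_single_gene_transcripts

print(f"operon classes sum to {operon_class_total}")
print(f"minimum TUs: {min_tus}")
print(f"single-gene TUs: {single_gene_fraction:.1%}")
print(f"genes accounted for: {genes_accounted_for}")
```

The estimate is a lower bound because the 288 operons predicted to contain multiple transcripts are each counted only once.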

One transcriptome (K65-1) probably entered stationary phase
In the PCA, the K65-1 Illumina transcriptome was positioned between the other Illumina 65°C transcriptomes and the 77°C transcriptomes (Fig. S3A). Inspection of its expression profile revealed that many genes significantly up-regulated at 77°C (Table S5) also showed higher expression in the K65-1 transcriptome, including the extreme heat stress sigma factor-24 (rpoE) (Kole_2150) and the RNAs ssrA (Kole_R0006) and rnpB (Kole_R0049) (Table S4, Table S5). On the other hand, other genes involved in the heat response, such as protease Do (Kole_1599) and the ATP-dependent protease La (Kole_0536), did not show increased expression in K65-1, and most of the genes significantly down-regulated at 77°C did not show decreased expression in K65-1, suggesting that the K65-1 culture was not under heat stress. Notably, rpoE has been implicated not only in the response to temperature stress, but also in a general stress response and in programmed cell lysis in early stationary phase in E. coli (Kabir et al. 2005). Another gene up-regulated in K65-1 (Kole_0545; a ribosome-associated inhibitor, or protein Y) has also been shown to be associated with stationary phase, as well as with the immediate cold-shock response (Wilson and Nierhaus 2004). Since the time of cell harvest was estimated from the growth curves (see Materials and Methods), it can only be considered approximate. Therefore, we suspect that this culture may have entered stationary phase, and we excluded it from further analyses.

Predicted sources of laterally transferred genes
Of K. olearia's 2,118 protein-coding genes, 354 (17%) were predicted to have been acquired via lateral gene transfer after the Kosmotogales diverged from other Thermotogae (Table S8; see the main text for discussion of their role in thermoadaptation). Of these, 203 are present only in Kosmotoga spp., while the remaining 151 are shared with Mesotoga spp. (Table S8). Assigning the putative donor lineage as the top-scoring BLASTP match from the Distal group, we infer that the three largest donor lineages are Firmicutes (181 genes), Archaea (45 genes) and Proteobacteria (29 genes). The predicted gene donors are in concordance with lineages previously identified as involved in gene exchange with Thermotogae (Nelson et al. 1999; Zhaxybayeva et al. 2009; Zhaxybayeva et al. 2012).

[...] Table S1. Gene clusters were formed using the "average linkage" method, Manhattan distance as a distance metric, and log2-transformed RPKM values as data. Transcriptomes were clustered using the "average linkage" method and Euclidean distances. All plots were produced using the pheatmap package in R (Kolde 2015). Panel A: Clustering of 51 genes potentially involved in energy production during growth on pyruvate (Table S3) [...] post-translational modification, protein turnover, and chaperones; C, energy production and conversion; G, carbohydrate transport and metabolism; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; H, coenzyme transport and metabolism; I, lipid transport and metabolism; P, inorganic ion transport and metabolism; Q, secondary metabolites biosynthesis, transport, and catabolism; R, general function prediction only; S, function unknown; NC, not in COG database. In addition, locus tags for genes from selected COG categories are color coded as follows: P light green, I dark green, C orange, G red, E dark blue and O light blue. The data used to produce the heatmaps is provided in Table S5. Zoom in to see locus tag labels.

[...] Table S1.
Gene clusters were formed using the "average linkage" method, Manhattan distance as a distance metric, and log2-transformed RPKM values as data. Transcriptomes were clustered using the "average linkage" method and Euclidean distances as a distance metric. Colors represent log2-transformed RPKM values (see color bar to the right of the heatmap). The clustering and plotting were carried out using the pheatmap package in R (Kolde 2015). The data used to produce the heatmaps is provided in Table S4.

[...] transcriptomes from a given temperature (except K65-1). Error bars refer to one standard deviation across replicates. Asterisks mark the two genes classified as temperature-responsive: Kole_0968 (significantly differentially expressed in the 40°C vs 65°C comparison) and Kole_0969 (significantly differentially expressed in the 40°C vs 65°C and 30°C vs 65°C comparisons). Fatty acid synthesis genes were identified using KEGG (Kanehisa et al. 2014), as implemented in the IMG portal (Markowitz et al. 2014).

[...] (subsets of all temperature-responsive genes) were compared to that in panel A (all temperature-responsive genes). Note that the proportions were calculated from the total number of temperature-responsive genes in each subset, and since many genes are up- or down-regulated at more than one temperature treatment, the total exceeds 1.

[...] (Katoh et al. 2002; Katoh and Standley 2013) within Geneious v. 9.1.3. The phylogenetic trees were reconstructed using the maximum likelihood method (WAG + Γ substitution model; shape parameter estimated; four rate categories) as implemented in RAxML (Stamatakis 2006) within Geneious v. 9.1.3. Bootstrap analysis was performed with 100 replicates. Bootstrap support values below 70% are not shown. Panel A: Enoyl-(acyl-carrier-protein) reductase II (fabK, Kole_0970), a gene involved in fatty acid synthesis. Monophyletic Thermotogae genera are collapsed into triangles and color-coded according to their optimal growth temperature: hyperthermophiles are in red, thermophiles are in orange, and mesophiles are in blue. Two clades consisting of Firmicutes are also collapsed. Panel B: Cold shock proteins (Csp) Kole_0109, Kole_1491, and Kole_2064. On this phylogenetic tree, Kole_0109 and its homologs in other Kosmotoga spp. cluster outside the Thermotogae clade, suggesting that it was probably laterally acquired. Notably, in the K. olearia genome Kole_0109 is adjacent to two genes also predicted to have been laterally acquired (Kole_0110 and Kole_0111, see Table 1 and Table S8).
Therefore, although Csp proteins are short and highly conserved, resulting in poor bootstrap support (< 70%) for all the branches of the Csp tree, using the combined evidence we hypothesize that Kole_0109, Kole_0110 and Kole_0111 were likely acquired laterally from an unidentified bacterial lineage in a single transfer event. Panel C: PpiC-type PPIases, Kole_1682 and Kole_0383. Only homologs with > 30% amino acid sequence identity to the K. olearia homologs were included in the analysis, since the homologs with lower sequence identity could not be confidently aligned.
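The heatmap legends describe clustering genes with average linkage on Manhattan distances of log2-transformed RPKM values, and clustering transcriptomes with average linkage on Euclidean distances. As a minimal illustration of what that procedure does, the sketch below reimplements average-linkage clustering in plain Python. It is not the pheatmap code used for the figures. The gene names, sample names, RPKM values, and the pseudocount of 1 added before the log2 transform are all hypothetical assumptions made so that the example runs.

```python
import math
from itertools import combinations

def manhattan(a, b):
    """Sum of absolute differences between two profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """Straight-line distance between two profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles, dist):
    """Agglomerative clustering with average linkage.

    The distance between two clusters is the mean of all pairwise
    distances between their members. Returns the merges in order as
    (members_a, members_b, distance) tuples.
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            d = sum(dist(profiles[i], profiles[j]) for i in a for j in b)
            d /= len(a) * len(b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Replace the two closest clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges

# Hypothetical RPKM values for four made-up genes in four made-up samples.
samples = ["s1", "s2", "s3", "s4"]
rpkm = {
    "gene_a": [10, 12, 100, 110],
    "gene_b": [11, 13, 95, 120],
    "gene_c": [200, 210, 20, 18],
    "gene_d": [190, 220, 22, 16],
}

# log2 transform; the +1 pseudocount is an assumption to avoid log2(0).
log_rpkm = {g: [math.log2(v + 1) for v in vals] for g, vals in rpkm.items()}

# Genes (rows): average linkage on Manhattan distances.
gene_merges = average_linkage(log_rpkm, manhattan)

# Samples (columns): average linkage on Euclidean distances.
sample_profiles = {
    s: [log_rpkm[g][i] for g in log_rpkm] for i, s in enumerate(samples)
}
sample_merges = average_linkage(sample_profiles, euclidean)

for a, b, d in gene_merges:
    print(f"merge {a} + {b} at distance {d:.3f}")
```

In this toy data the two genes with similar profiles across samples are joined first, and the two resulting groups are joined last, which is the pattern a heatmap dendrogram displays.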