Abstract
In Populus, drought is a major stress affecting plant growth and development, and its effects are closely reflected by corresponding transcriptomic changes. Nevertheless, how these changes occur in Populus is not fully understood. Here, we first used meta-analysis and machine learning methods to identify water stress-responsive genes and then applied a systematic approach to discover important gene networks. Our analysis revealed that large transcriptional variations occur during drought stress. These changes were mostly associated with the response to stress, cellular catabolic process, metabolic pathways, and hormone-related genes. The differential gene coexpression analysis highlighted two genes, acetyltransferase NATA1-like and putative cytochrome P450, that make a special contribution to the drought stress response. In particular, the findings showed that MYBs and MAPKs have a prominent role in the drought stress response and could be considered for improving the drought tolerance of Populus. We also suggest ARF2-like and PYL4-like genes as potential markers for use in breeding programs. This study provides a better understanding of how Populus responds to drought, which could be useful for improving stress tolerance in Populus.
Introduction
Climate change is a worldwide problem that has negative impacts on ecological systems, species distribution and forest growth. High temperatures resulting from climate change increase the frequency and intensity of drought. Such extreme drought conditions can decrease forest productivity and increase tree mortality1,2,3. Drought has significant effects on plant growth and development; these effects are mainly due to changes in the normal rates of photosynthesis, osmotic adjustment and oxidative damage4,5. The response of plants to drought is a complex mechanism involving morphological, physiological, and metabolic changes6. A variety of protective genes, proteins, and pathways are involved in the protective responses against stress conditions7.
Poplars (Populus sp.) are trees with high ecological and economic value that are widely distributed around the world and are known as woody plants sensitive to water stress8. Populus serves as a model system for biological research and genetic analysis in forest trees9. Genomic information for the Populus genus provides a valuable resource for investigating genome features and characterizing stress-related genes in woody plants. Drought stress can damage cellular structures and macromolecules, decrease the photosynthesis rate, reduce growth and biomass production, and sometimes even lead to tree death. Studies have shown that, at the molecular level, several genes and multiple biological processes are involved in the adaptation of Populus to stress. For example, water deficit significantly changes the expression profiles of numerous genes encoding transcription factors (TFs), plant phosphatases and hormones10. Additionally, several genes that play an important role in the response to drought stress, such as ProDH, JAZ3, RAP2.6, LEA and ABA1, are induced in Populus11,12. Investigating the transcriptome and discovering different drought-tolerance genes are highly informative for understanding the mechanisms of plant stress tolerance. Different strategies have been employed to determine the molecular basis of the Populus response to drought stress4,13,14,15.
In recent years, high-throughput gene expression technologies have provided valuable transcriptome information for research on the genes and molecular mechanisms involved in stress. The detection of differentially expressed genes is one of the main strategies for data analysis. A wide variety of methods are commonly applied to datasets with small sample sizes to screen genes. Restricting the analysis to an individual study reduces statistical power, which may lead to unreliable results. Meta-analysis is a powerful approach for integrating gene expression datasets and obtaining more robust and accurate gene signatures by increasing sample size. Several studies have used meta-analysis to identify genes responsive to environmental stress in plants16,17,18,19. Recently, machine learning models have emerged as an attractive strategy for gaining new biological insights. Generally, these methods, combined with efficient dimensionality reduction and feature selection methodologies, can integrate transcriptome studies and thereby find significant features and underlying mechanisms20. Different algorithms such as support vector machine (SVM) and principal component analysis (PCA) have been employed in feature selection. These algorithms use various evaluation criteria for classifying data and scoring the input features. In the area of plant stress, the effective use of machine learning and feature selection models for selecting gene features has been reported in rice21, Arabidopsis22, potato23, and maize24. However, machine learning algorithms have not yet been applied at the transcriptome level to identify key signatures related to environmental stress in Populus.
Despite the importance of identifying differentially expressed genes, this strategy mostly focuses on discovering gene content and does not explore relationships among genes. Coexpression network analyses allow us to obtain a system-level view of gene–gene connections. Moreover, the biological role of genes with unknown functions can be predicted using methods based on coexpression25. There are different approaches for constructing networks from gene expression data, including coexpression and differential coexpression analyses. Coexpression analysis builds networks of genes based on the similarity between the expression patterns of each gene pair across all samples, which can be used to identify co-activated and regulatory genes. Differential coexpression analysis, in contrast, discovers genes whose coexpression partners change between states, which can help define gene groups affected by the change of state26. These analyses have been widely applied to better understand molecular mechanisms in plant species16,27.
In this study, we employed large transcriptome data to gain a comprehensive view of the drought stress response in Populus. We performed a meta-analysis in combination with machine learning techniques to identify drought-responsive genes. Additionally, through coexpression network analyses, we detected functional sets of genes associated with drought stress. Our findings provide valuable information about the mechanisms underlying the drought stress response, which can be used for future genetic improvement and breeding programs in tree species.
Results
To understand the transcriptional responses of Populus to drought stress, 13 microarray datasets (Supplementary Table S1) consisting of 324 arrays in total were considered. After pre-processing and removing the batch effect, the normalized datasets were obtained for further downstream analysis. These datasets were divided into control and stress conditions in each study. From meta-analysis, we identified 3178 genes with differences in expression level between stress and control conditions (FDR < 0.05). Among DEGs, 2060 genes were up-regulated and 1118 genes were down-regulated in drought compared to normal conditions (Supplementary Table S2, Fig. 1). The gene ontology (GO) terms showed that upregulated DEGs were most significantly enriched in response to stress, response to stimulus and cellular catabolic process while the downregulated DEGs were related to small molecule metabolic process, cellular ketone metabolic process and organic acid metabolic process (Supplementary Table S3, Fig. 2A). The most important molecular function terms associated with up-regulated DEGs included phosphoprotein phosphatase activity and phosphoric ester hydrolase activity. Moreover, results of KEGG pathway analysis revealed that the up-regulated DEGs were mostly enriched in metabolic pathways, spliceosome and plant hormone signal transduction, while the down-regulated DEGs were significantly associated with phagosome and amino sugar and nucleotide sugar metabolism (Supplementary Table S3).
Due to the high dimensionality of microarray gene expression data, feature selection techniques, as a branch of machine learning, can be of great help in distinguishing genes with key biological functions. In this study, to discover the transcriptomic signature of the response to drought stress in poplar species, eight feature selection models were implemented on the gene expression dataset in stress and normal groups. In total, 648 genes were identified as the most important features by at least one of the models (Supplementary Table S4). Functional annotation showed that the feature genes were enriched in response to abiotic stimulus. In addition, many genes were related to secondary metabolic process. Interestingly, seven of the methods selected Auxin response factor 2-like (PtpAffx.211941.1.S1_at) and PYL4-like (PtpAffx.31936.1.A1_at) as important features. Moreover, the meta-analysis and feature selection techniques shared 232 genes, six of which are involved in the MAPK signaling pathway. Thus, our results indicate a key role of MAPKs in Populus during drought stress. Here, we also identified 210 transcription factors from 35 different gene families responding to drought stress. Among them, the MYB, bZIP, NAC, ERF, and bHLH families comprise a high proportion of drought-responsive members (Fig. 2B).
Cis-regulatory element analysis
To discover the conserved motifs and consensus cis-regulatory elements (CREs) in the promoters of DEGs, we applied the MEME tool and identified 14 motifs with lengths ranging from 15 to 50 bp (Table 1). We also compared the identified motifs with known motifs in the CIS-BP database. We found that eight of the motifs matched known motifs related to various TFs, including ARID, C2H2, AHL, AP2/B3-like, BBR/BPC, MYB, and Sox (Supplementary Table S5). GO term analysis revealed that the motifs are involved in the regulation of transcription, transmembrane receptor protein tyrosine kinase signaling pathway and development (Table 1, Supplementary Table S6). Interestingly, this analysis highlighted a motif associated with response to salicylic acid stimulus (GO:0009751). Moreover, some motifs were involved in circadian rhythm and response to auxin stimulus (Supplementary Table S6).
Identification of coexpression networks by WGCNA
We constructed gene coexpression networks using WGCNA based on the gene expression data of DEGs derived from the meta-analysis and feature selection. A total of five modules were detected, ranging in size from 1145 genes in the turquoise module to 305 genes in the green module (Fig. 3). The biological process gene ontology (GO) analysis revealed that the turquoise module was mainly related to small molecule metabolic process and microtubule cytoskeleton organization. Moreover, 41 genes in the turquoise module were mainly related to protein transport. The genes in the blue module were mainly enriched in RNA biosynthetic process, whereas the genes in the brown module were mainly enriched in RNA splicing and response to reactive oxygen species. The yellow and green modules were also significantly enriched in mRNA metabolic process and regulation of RNA metabolic process, respectively. The KEGG pathway analysis showed that phagosome and fatty acid degradation were significantly associated with the turquoise and blue modules, respectively. Interestingly, genes in the turquoise and green modules were significantly enriched in plant hormone signal transduction (Supplementary Table S7).
Identification of hub genes
To discover the key genes associated with the modules, we identified the genes with the most connections in each module, which were defined as hub genes (Supplementary Table S8, S9). The top 10 genes were screened from each module. For the turquoise module, the top hub was CSLC5-like (PtpAffx.214284.1.S1_at). The top hub genes in the blue module included amino acid transporter AVT6A-like (PtpAffx.7148.2.S1_a_at) and MDAR6-like (PtpAffx.27718.1.S1_s_at). In addition, GIGANTEA-like protein (PtpAffx.25624.1.A1_s_at), E3 ligase Rma1H1-like (PtpAffx.8582.3.S1_a_at) and protein phosphatase 4 (PtpAffx.935.1.S1_at) were hub genes in the brown, green and yellow modules, respectively (Fig. S1).
Differential gene coexpression network analysis
Generally, differential expression analysis investigates each gene separately and does not consider its relationship with other genes28. To detect changes in gene expression patterns between normal and stress conditions, we performed differential coexpression analysis and calculated Pearson's correlations for all gene pairs. We detected 28,863 gene pairs with significant coexpression changes between the conditions at a cutoff of FDR < 0.01 (Supplementary Table S10). Finally, a differential correlation network with 11,344 nodes (genes) was constructed. In a differential correlation network, nodes with more connected edges represent genes that have the most distinct expression patterns between the two conditions. Acetyltransferase NATA1-like (PtpAffx.18595.1.S1_at) and putative cytochrome P450 (PtpAffx.209025.1.S1_at), with 430 and 365 edges, respectively, were the top differentially connected genes (Fig. 4). Under normal conditions 7705 gene pairs, and under stress conditions 8264 gene pairs, showed a positive correlation; most of these genes were related to response to abiotic stimulus and small molecule metabolic process. Additionally, 10,422 genes showed the opposite correlation direction between the two conditions. According to the GO analysis, the genes with different correlation directions are involved in response to stimulus and cellular catabolic process (Supplementary Table S11). The differential coexpression analysis revealed that peptidylprolyl isomerase (PtpAffx.25142.1.A1_at) and thioredoxin-dependent peroxiredoxin (PtpAffx.3462.1.A1_s_at) had the strongest change in correlation, with the correlation coefficient decreasing from 0.67 to −0.33 after stress.
Discussion
Drought stress induces a wide range of responses that enable plants to survive under restricted water availability. Populus is a woody plant model with sufficient genomic information, and because of the importance of this genus for biological studies, we collected comprehensive expression data for drought stress. We designed a pipeline with a systematic view for investigating transcriptional changes during drought stress to identify the genetic architecture in Populus. With meta-analysis and feature selection methods for expression data, we determined 3,594 genes that are differentially regulated by drought stress. Of these DEGs, 232 genes were identified by both methods, confirming that many of the genes selected by feature selection methods were included among those selected by meta-analysis. The GO enrichment analysis suggests that DEGs are involved in a large range of biological functions. The majority of these GO terms were associated with metabolic process, response to stress and catabolic process. Both up- and down-regulated DEGs were enriched for the secondary metabolic process (Supplementary Table S3). However, most genes that belonged to this class were up-regulated. The accumulation of secondary metabolites is an important part of a plant’s protection strategy against environmental stress29. The accumulation of phenolic and flavonoid compounds has been reported in Populus under drought stress30. A survey of KEGG pathways detected 17 pathways (Supplementary Table S3). The results revealed that the plant hormone signal transduction pathway was enriched. In this pathway, 17 DEGs such as STK (PTPAFFX.163630.1.S1_AT) and ABF1 (PTPAFFX.215822.1.S1_S_AT) were upregulated. Plant hormones act as central integrators of signaling cascades in adaptation to stress31. A previous study demonstrated that ABA, ethylene, brassinosteroid and jasmonate signal pathways may be involved in Populus tomentosa cold response processes32.
STK and ABF1 are associated with responses to drought stress in plants33. The pathway analysis also demonstrated that drought stress had a significant effect on carbon metabolism (Supplementary Table S3), indicating that Populus may have a defensive response that conserves energy under stress conditions. We also found that the expression of six genes related to the MAPK signaling pathway changed during drought stress, which suggests that MAPKs have key roles in the drought response in Populus. Among the DEGs, transcription factors such as bZIPs, NACs and ERFs, which are known to be involved in stress responses, were also detected. The bZIP-type TFs play major roles in the regulation of development and salt stress responses of Populus34. Additionally, 24 members of the MYB family differed in expression between conditions (Fig. 2B). Previous studies have demonstrated that MYBs participate in various processes and have important functions in plant stress tolerance35,36. In Tamarix hispida, the overexpression of ThMYB13 enhances salt stress tolerance37. These results imply a regulatory role of MYBs in the drought response of Populus.
According to most feature selection models, two genes, ARF2-like (PtpAffx.211941.1.S1_at) and PYL4-like (PtpAffx.31936.1.A1_at), were defined as important features (Supplementary Table S4); they are involved in several biological processes such as cellular metabolism and developmental regulation. ARF plays an important role in hormone regulation and plant stress response38. Recently, an ARF was discovered that is crucial for early developing xylem in Populus39. This suggests that the expression change of ARF2 may be a reaction mechanism against water stress that acts by affecting xylem function. In addition, the PYL genes encode abscisic acid receptors that have a role in regulating drought tolerance40. A previous study showed that overexpression of PYL4 in Arabidopsis resulted in drought resistance41.
To obtain further insights into the mechanism of gene regulation during drought stress in Populus, motifs and transcription factor binding sites were discovered in promoter regions of DEGs. We found 14 distinct conserved motifs, including matches to binding sites of ARID, C2H2, AHL, AP2/B3-like, BBR/BPC, MYB, and Sox TFs (Table 1, Supplementary Table S5). Signal transduction, regulation of transcription and response to salicylic acid were important functional categories for these motifs (Supplementary Table S6). Previous studies showed the role of MYB in various biological processes, especially responses to environmental stresses42. The MYB binding site cis-element is necessary for the expression of drought-inducible genes43. In Populus, expression changes of MYB genes are observed under different stresses. These genes may also participate in the regulation of circadian rhythms, the senescence-related ABA signaling cascade and cell-fate determination44. Yang et al.45 detected cis-acting elements involved in abscisic acid and stress responsiveness in the upstream regions of PtrMYB053 and PtrMYB081. The AHLs are a group of transcriptional regulators with a highly conserved structure in the plant kingdom, and extensive evidence demonstrates their function in plant growth and development as well as stress responses46. A previous study identified 37 AHL genes in Populus trichocarpa whose expression levels were induced by drought stress47.
To assess the interactions among DEGs and better understand the mechanisms involved in drought stress, a WGCNA was performed. The DEGs were grouped into five modules (Fig. 3). Functional analysis showed that the genes in the modules were mostly associated with molecule metabolic process, RNA biosynthetic process, RNA splicing, and stress response (Supplementary Table S7). Some DEGs in the turquoise module belonged to the categories generation of precursor metabolites and energy, and transport, reflecting the importance of these processes in the response to stress. Saving energy and maintaining homeostasis under stress is considered an adaptive feature. During stress, plants shunt energy sources from growth to adaptation through adjustments of metabolism and gene expression48. The role of transporters as key regulatory molecules in sugar metabolism, abscisic acid signaling, stress responses and enhancing photosynthetic activity has been established49,50. The turquoise module included an ABC-2 type transporter (PtpAffx.85773.1.S1_at) and TRH1 (PTP.121.1.S1_S_AT). ABC transporters are necessary for the internal detoxification of ions and act as regulators for maintaining plant homeostasis under environmental stresses51. Previous studies have proposed that TRH1 is required for root hair elongation and participates in auxin translocation and accumulation52,53.
Genes in the brown module were involved in RNA splicing and response to reactive oxygen species. Environmental stresses damage DNA and affect its stability. Studies demonstrate the importance of alternative splicing in DNA repair for plants54. Among the genes involved in response to reactive oxygen species, we identified two genes, HSP15-like (PTPAFFX.53720.1.S1_AT) and APX3-like (PTP.4040.1.S1_S_AT). Heat shock proteins are ubiquitous proteins that accumulate in response to abiotic stress55,56. In Populus euphratica, HSPs had high transcript levels after stress57. In our results, HSP15-like (PTPAFFX.53720.1.S1_AT) was significantly induced under drought stress. The yellow module genes were enriched for vacuole organization. Vacuoles play a major role in plant trafficking pathways and the response to environmental signaling, and participate in oxidative-stress resistance58.
We also identified the genes with the highest connectivity (hubs) within each module (Supplementary Table S9), which are the key components of the networks. We found that the hub genes were enriched in the response to misfolded protein. The most highly connected gene in the turquoise module, CSLC5-like (PtpAffx.214284.1.S1_at), has been reported to be related to developing cell walls59. It has been shown that activation of CSLC5-like is essential for protection from adverse environmental conditions. The hub gene of the blue module (protein phosphatase 4) was involved in a variety of cellular functions60, suggesting that a change in phosphorylation status is one of the most important regulatory mechanisms of stress responses in Populus. Interestingly, amino acid transporter AVT6A-like (PtpAffx.7148.2.S1_a_at) and PLATZ1 (PtpAffx.27718.1.S1_s_at) were the most interconnected genes in the blue module. PLATZ encodes a class of plant-specific zinc-finger transcription factors that function in plant growth, development and abiotic stress responses through ABA61. Among the hub genes in the blue module, uspA-like (Ptp.4430.1.S1_a_at) is of interest. Universal stress proteins (USP) can arrest cell growth and constitute a biological defense mechanism under stress conditions62. In particular, GIGANTEA-like protein (PtpAffx.25624.1.A1_s_at) was identified as one of the hub genes in the brown module. This gene functions in developmental stage transitions and stress responses. Recent findings indicate that GIGANTEA-like protein mediates circadian rhythm and responses to stress in poplar63.
To identify significant changes in the coexpression structure and provide an overview of gene expression interactions between normal and stress conditions, we performed differential gene coexpression analysis. We identified over 28,000 gene pairs whose correlations differed significantly between the two conditions. We obtained a network with 11,344 genes (Fig. 4, Supplementary Table S10), of which 1460 genes were DEGs. Acetyltransferase NATA1-like (PtpAffx.18595.1.S1_at) and putative cytochrome P450 (PtpAffx.209025.1.S1_at) had the most connections in the network. NATA1 has been implicated in the regulation of polyamine acetylation and plays a role in the complex cross-talk between salicylate and jasmonate signaling64. A previous study indicated that NATA1 is also involved in ABA-mediated stomatal closure in A. thaliana65. Cytochrome P450s are one of the largest enzyme families and play an essential role in stress response. These key enzymes have also been detected in Populus trichocarpa, where they participate in volatile formation66.
The results also indicated that the coexpression between the peptidylprolyl isomerase (PtpAffx.25142.1.A1_at) and thioredoxin-dependent peroxiredoxin (PtpAffx.3462.1.A1_s_at) genes changed extensively after stress. Peptidylprolyl isomerase is associated with the folding of newly synthesized proteins67. Protein folding is a fundamental process for cell survival under adverse environmental conditions68. It has been suggested that peptidylprolyl isomerases have a role in the abiotic stress response in plants69. In addition, thioredoxin-dependent peroxiredoxin, as part of antioxidant defenses and redox signaling, is essential for the cellular response to oxidative stress70. A previous observation implies a possible link between the mitochondrial thioredoxin system and peptidyl prolyl cis–trans isomerase activity71.
Collectively, in this study, we implemented a strategy for the detection of genes and underlying mechanisms involved in response to drought stress in Populus. Based on the meta-analysis, feature selection algorithms and coexpression analysis, we highlighted genes and pathways that play key roles during drought stress. The findings revealed that transcriptional changes under drought stress can be extremely varied in Populus. These results showed that drought stress induces a complex of hormone signaling pathways. In addition, TFs such as MYBs and bZIPs are key for Populus response. In particular, our analysis suggests that ARF2-like (PtpAffx.211941.1.S1_at) and PYL4-like (PtpAffx.31936.1.A1_at) genes can be potential candidates for screening and breeding purposes in Populus. The results obtained can pave the way for understanding the molecular basis of drought response and for further investigations in Populus.
Methods
Data collection and preprocessing
The array expression datasets were retrieved from the Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/gds) and ArrayExpress (https://www.ebi.ac.uk/arrayexpress) (Supplementary Table S1). The data consisted of 13 studies generated with the Affymetrix GeneChip Poplar Genome Array. The chip description file (CDF) and annotation file were downloaded from the Affymetrix site. The robust multichip average (RMA) algorithm from the Affy R package was used to background-correct and normalize the gene expression data72. Moreover, genes that had low mean and low variation in expression values were filtered out. Finally, an empirical Bayes method was applied to correct non-biological differences and remove batch effects from the gene expression datasets using the ComBat function in the SVA R package73.
Differential gene expression analysis
To identify upregulated and downregulated differentially expressed genes (DEGs) between normal and drought samples, meta-analysis was performed using the effect size combination method in the metaMA R package74. Genes with an FDR ≤ 0.05 were defined as DEGs.
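The idea behind combining effect sizes can be illustrated with a minimal Python sketch. It is not the metaMA implementation: it assumes a simple fixed-effect, inverse-variance combination of Hedges' g for a single gene, and the function names (hedges_g, combine_effects) and the expression values are hypothetical.

```python
import math

def hedges_g(stress, control):
    """Bias-corrected standardized mean difference and its approximate variance."""
    n_s, n_c = len(stress), len(control)
    mean_s = sum(stress) / n_s
    mean_c = sum(control) / n_c
    var_s = sum((x - mean_s) ** 2 for x in stress) / (n_s - 1)
    var_c = sum((x - mean_c) ** 2 for x in control) / (n_c - 1)
    pooled_sd = math.sqrt(((n_s - 1) * var_s + (n_c - 1) * var_c) / (n_s + n_c - 2))
    d = (mean_s - mean_c) / pooled_sd
    correction = 1 - 3 / (4 * (n_s + n_c) - 9)  # small-sample bias correction
    g = correction * d
    var_g = (n_s + n_c) / (n_s * n_c) + g ** 2 / (2 * (n_s + n_c))
    return g, var_g

def combine_effects(per_study):
    """Fixed-effect inverse-variance combination of (g, var_g) pairs.

    Returns the combined effect, its z statistic and a two-sided p-value.
    """
    weights = [1.0 / var for _, var in per_study]
    combined = sum(w * g for w, (g, _) in zip(weights, per_study)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = combined / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return combined, z, p_value

# Hypothetical log2 expression of one gene in two studies (stress, control).
studies = [
    ([8.1, 8.4, 8.0, 8.3], [7.0, 7.2, 6.9, 7.1]),
    ([9.0, 9.5, 9.2], [8.8, 8.1, 8.4]),
]
effects = [hedges_g(stress, control) for stress, control in studies]
print(combine_effects(effects))
```

In a full analysis this calculation would be repeated for every gene, and the resulting p-values would then be adjusted for multiple testing before applying the FDR cutoff.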
Feature selection algorithms were employed to reduce the dimensionality of the expression dataset and identify the gene expression features between normal and stress conditions. We implemented various attribute weighting algorithms, including Support Vector Machine (SVM), Chi Squared, Information Gain, Information Gain Ratio, Deviation, Gini Index, Uncertainty, Relief, and PCA on corrected data to identify the most important genes75. The feature selection was performed using RapidMiner Studio software (version 7.0.001).
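As an illustration of how one attribute weighting criterion scores a gene, the following Python sketch computes information gain for a single expression profile. It is a simplified stand-in, not the RapidMiner operator: it assumes each gene is discretized by a median split, and the gene names and expression values are hypothetical.

```python
import math
from collections import Counter
from statistics import median

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(expression, labels):
    """Information gain of the class labels after a median split of one gene."""
    cut = median(expression)
    high = [lab for x, lab in zip(expression, labels) if x > cut]
    low = [lab for x, lab in zip(expression, labels) if x <= cut]
    if not high or not low:
        return 0.0  # the split does not separate any samples
    n = len(labels)
    conditional = (len(high) / n) * entropy(high) + (len(low) / n) * entropy(low)
    return entropy(labels) - conditional

# Hypothetical data: four normal (N) and four stress (S) samples.
labels = ["N", "N", "N", "N", "S", "S", "S", "S"]
genes = {
    "gene_A": [1, 2, 3, 4, 10, 11, 12, 13],   # separates the two groups
    "gene_B": [1, 10, 2, 11, 3, 12, 4, 13],   # unrelated to the groups
}
weights = {name: information_gain(values, labels) for name, values in genes.items()}
for name, weight in sorted(weights.items(), key=lambda item: item[1], reverse=True):
    print(name, round(weight, 3))
```

Each weighting algorithm produces a ranking of this kind using its own criterion, and a gene can then be counted by how many algorithms place it among the top features.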
Functional analysis
To investigate the functions related to the DEGs, we conducted Gene Ontology (GO) enrichment analysis using AgriGO (http://systemsbiology.cau.edu.cn/agriGOv2/). The web server REVIGO was used to remove redundant enriched GO terms. The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was performed with the gProfiler tool (https://biit.cs.ut.ee/gprofiler/gost). GO terms and KEGG pathways with an adjusted P-value < 0.05 were defined as significant. In addition, transcription factor (TF) families were obtained by BLASTX search against Populus trichocarpa transcription factors (http://planttfdb.cbi.pku.edu) with a cut-off of E ≤ 10⁻⁶.
Cis-elements analysis
To discover conserved motifs and investigate cis-elements in DEGs, the 1 kbp upstream sequences of the genes were extracted from Ensembl Plants (http://plants.ensembl.org) and then submitted to the MEME program (http://meme-suite.org/). The Tomtom v 5.4.1 tool (http://meme-suite.org/tools/tomtom)76 was employed to identify known TF binding sites based on the CIS-BP database. The GoMo tool (http://meme-suite.org/tools/gomo) was also used to predict the biological functions of the motifs.
Coexpression network analysis and identification of potential hub genes
The weighted gene correlation network analysis (WGCNA)77 was used to construct the co-expression network and identify gene modules. First, the gene expression similarity matrix was built from the normalized gene expression data of the DEGs by determining Pearson’s correlation coefficient (PCC) between gene pairs. Subsequently, the gene expression similarity matrix was transformed into an adjacency matrix using a soft threshold value (β) of 10. Then, the adjacency matrix was converted into a topological overlap matrix (TOM) using the TOM similarity algorithm. Finally, the dynamic tree-cut algorithm was used to identify the modules of co-expressed DEGs with a maxBlockSize of 6,000, minModuleSize of 30 and mergeCutHeight of 0.2. In addition, according to intramodular connectivity, genes with a high kME were considered hub genes in each module. The gProfiler tool was used to perform GO and KEGG functional enrichment analysis of the significant modules. We used Cytoscape to visualize the networks.
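The first steps of this procedure (correlation, soft thresholding and topological overlap) can be sketched in plain Python. This is an illustrative sketch rather than the WGCNA package code: it assumes an unsigned network, which the text does not state, uses the standard unsigned TOM formula, and runs on hypothetical expression profiles.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def adjacency(profiles, beta=10):
    """Unsigned soft-threshold adjacency a_ij = |cor_ij| ** beta (zero diagonal)."""
    g = len(profiles)
    adj = [[0.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(i + 1, g):
            a = abs(pearson(profiles[i], profiles[j])) ** beta
            adj[i][j] = adj[j][i] = a
    return adj

def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    g = len(adj)
    connectivity = [sum(row) for row in adj]
    tom = [[1.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(i + 1, g):
            shared = sum(adj[i][u] * adj[u][j] for u in range(g))
            value = (shared + adj[i][j]) / (
                min(connectivity[i], connectivity[j]) + 1.0 - adj[i][j]
            )
            tom[i][j] = tom[j][i] = value
    return tom

# Hypothetical profiles: one row per gene, one column per sample.
profiles = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [5.0, 3.0, 4.0, 1.0, 2.0],
    [1.0, 3.0, 2.0, 5.0, 4.0],
]
adj = adjacency(profiles, beta=10)
tom = topological_overlap(adj)
dissimilarity = [[1.0 - t for t in row] for row in tom]  # input for clustering
```

Modules are then obtained by hierarchical clustering of the dissimilarity matrix followed by dynamic tree cutting, which this sketch does not cover.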
Differential coexpression analysis
To identify significant differential correlations between normal and stress conditions, a differential coexpression gene network was constructed using the DiffCorr R package78. First, the expression data were divided into normal and stress groups, and then genes with zero expression or low variation across samples were removed, resulting in 30,865 genes selected for further analysis. Pearson correlations were calculated between each pair of genes for each group, and each correlation value was transformed to a Z score using Fisher’s Z transformation. Next, the difference in correlation of a gene pair between normal and stress conditions was calculated by the equation: \(\Delta Z=\frac{Z_{S}-Z_{N}}{\sqrt{\frac{1}{n_{S}-3}+\frac{1}{n_{N}-3}}}\) where \(Z_{N}\) and \(Z_{S}\) are the transformed correlations, and \(n_{N}\) and \(n_{S}\) the numbers of samples, under normal and stress conditions, respectively. Finally, Fisher’s z-test was applied to detect significant differential correlations. Differential correlations with an FDR < 0.01 were considered significant79. The differential coexpression network was visualized with Cytoscape. A schematic workflow summarizing the major steps of this study is shown in Fig. 5.
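The test statistic for a single gene pair can be reproduced with a short Python sketch. It follows the equation given above but is not the DiffCorr code; the function name is hypothetical, and the sample sizes in the example are assumed for illustration because the per-group sample counts are not given here.

```python
import math

def differential_correlation(r_normal, n_normal, r_stress, n_stress):
    """Fisher z-test for the difference between two Pearson correlations.

    Returns the standardized difference (Delta Z) and a two-sided p-value.
    """
    z_normal = math.atanh(r_normal)   # Fisher's Z transformation
    z_stress = math.atanh(r_stress)
    se = math.sqrt(1.0 / (n_stress - 3) + 1.0 / (n_normal - 3))
    delta_z = (z_stress - z_normal) / se
    p_value = math.erfc(abs(delta_z) / math.sqrt(2))
    return delta_z, p_value

# Correlations 0.67 (normal) and -0.33 (stress) are the values reported for the
# peptidylprolyl isomerase / peroxiredoxin pair; 100 samples per group is an
# assumed, illustrative sample size.
delta_z, p_value = differential_correlation(0.67, 100, -0.33, 100)
print(delta_z, p_value)
```

The p-values obtained for all gene pairs would then be corrected for multiple testing before applying the FDR cutoff.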
Data availability
All data generated or analysed during this study are available from the corresponding authors on reasonable request.
Change history
22 February 2023
A Correction to this paper has been published: https://doi.org/10.1038/s41598-023-30033-z
References
Boisvenue, C. & Running, S. W. Impacts of climate change on natural forest productivity–evidence since the middle of the 20th century. Glob. Change Biol. 12, 862–882 (2006).
Seidl, R., Schelhaas, M.-J., Rammer, W. & Verkerk, P. J. Increasing forest disturbances in Europe and their impact on carbon storage. Nat. Clim. Chang. 4, 806–810 (2014).
Clark, J. S. et al. The impacts of increasing drought on forest dynamics, structure, and biodiversity in the United States. Glob. Change Biol. 22, 2329–2352 (2016).
Yin, C., Peng, Y., Zang, R., Zhu, Y. & Li, C. Adaptive responses of Populus kangdingensis to drought stress. Physiol. Plant. 123, 445–451 (2005).
Hussain, H. A. et al. Chilling and drought stresses in crop plants: Implications, cross talk, and potential management opportunities. Front. Plant Sci. 9, 393 (2018).
Feng, W., Lindner, H., Robbins, N. E. & Dinneny, J. R. Growing out of stress: The role of cell-and organ-scale growth control in plant water-stress responses. Plant Cell 28, 1769–1782 (2016).
Zhu, J.-K. Abiotic stress signaling and responses in plants. Cell 167, 313–324 (2016).
Tyree, M. T., Alexander, J. & Machado, J.-L. Loss of hydraulic conductivity due to water stress in intact juveniles of Quercus rubra and Populus deltoides. Tree Physiol. 10, 411–415 (1992).
Taylor, G. Populus: Arabidopsis for forestry. Do we need a model tree?. Ann. Bot. 90, 681–689 (2002).
Zhang, J., Huang, D., Zhao, X. & Zhang, M. Evaluation of drought resistance and transcriptome analysis for the identification of drought-responsive genes in Iris germanica. Sci. Rep. 11, 1–21 (2021).
Peng, S. et al. Transcriptional profiling reveals sexual differences of the leaf transcriptomes in response to drought stress in Populus yunnanensis. Tree Physiol. 32, 1541–1555 (2012).
Dash, M., Yordanov, Y. S., Georgieva, T., Wei, H. & Busov, V. Gene network analysis of poplar root transcriptome in response to drought stress identifies a PtaJAZ3PtaRAP2.6-centered hierarchical network. PLoS ONE 13, e0208560 (2018).
Tang, S. et al. Populus euphratica: The transcriptomic response to drought stress. Plant Mol. Biol. 83, 539–557 (2013).
Li, B., Qin, Y., Duan, H., Yin, W. & Xia, X. Genome-wide characterization of new and drought stress responsive microRNAs in Populus euphratica. J. Exp. Bot. 62, 3765–3779 (2011).
Yan, D.-H., Fenning, T., Tang, S., Xia, X. & Yin, W. Genome-wide transcriptional response of Populus euphratica to long-term drought stress. Plant Sci. 195, 24–35 (2012).
Tahmasebi, A. & Niazi, A. Comparison of transcriptional response of C3 and C4 plants to drought stress using meta-analysis and systems biology approach. Front. Plant Sci. 12, 1295 (2021).
Tahmasebi, A., Ashrafi-Dehkordi, E., Shahriari, A. G., Mazloomi, S. M. & Ebrahimie, E. Integrative meta-analysis of transcriptomic responses to abiotic stress in cotton. Prog. Biophys. Mol. Biol. 146, 112–122 (2019).
Rest, J. S., Wilkins, O., Yuan, W., Purugganan, M. D. & Gurevitch, J. Meta-analysis and meta-regression of transcriptomic responses to water stress in Arabidopsis. Plant J. 85, 548–560 (2016).
Ashrafi-Dehkordi, E., Alemzadeh, A., Tanaka, N. & Razi, H. Meta-analysis of transcriptomic responses to biotic and abiotic stress in tomato. PeerJ 6, e4631 (2018).
Arowolo, M. O., Adebiyi, M. O., Aremu, C. & Adebiyi, A. A. A survey of dimension reduction and classification methods for RNA-Seq data on malaria vector. J. Big Data 8, 1–17 (2021).
Shaik, R. & Ramakrishna, W. Machine learning approaches distinguish multiple stress conditions using stress-responsive genes and identify candidate genes for broad resistance in rice. Plant Physiol. 164, 481–495 (2014).
Liang, Y. et al. Prediction of drought-resistant genes in Arabidopsis thaliana using SVM-RFE. PLoS ONE 6, e21750 (2011).
Sprenger, H. et al. Metabolite and transcript markers for the prediction of potato drought tolerance. Plant Biotechnol. J. 16, 939–950 (2018).
Meng, X. et al. Predicting transcriptional responses to cold stress across plant species. Proc. Natl. Acad. Sci. 118, e2026330118 (2021).
Rhee, S. Y. & Mutwil, M. Towards revealing the functions of all genes in plants. Trends Plant Sci. 19, 212–221 (2014).
Van Dam, S., Vosa, U., van der Graaf, A., Franke, L. & de Magalhaes, J. P. Gene co-expression analysis for functional classification and gene–disease predictions. Brief. Bioinform. 19, 575–592 (2018).
Fukushima, A. et al. Exploring tomato gene functions based on coexpression modules using graph clustering and differential coexpression approaches. Plant Physiol. 158, 1487–1502 (2012).
Jiang, Z., Dong, X., Li, Z.-G., He, F. & Zhang, Z. Differential coexpression analysis reveals extensive rewiring of Arabidopsis gene coexpression in response to Pseudomonas syringae infection. Sci. Rep. 6, 1–13 (2016).
Austen, N., Walker, H. J., Lake, J. A., Phoenix, G. K. & Cameron, D. D. The regulation of plant secondary metabolism in response to abiotic stress: Interactions between heat shock and elevated CO2. Front. Plant Sci. 10, 1463 (2019).
Ahmed, U. et al. Expression profiling of flavonoid biosynthesis genes and secondary metabolites accumulation in Populus under drought stress. Molecules 26, 5546 (2021).
Golldack, D., Li, C., Mohan, H. & Probst, N. Tolerance to drought and salt stress in plants: Unraveling the signaling networks. Front. Plant Sci. 5, 151 (2014).
Yang, X. et al. Transcriptome profiling of Populus tomentosa under cold stress. Ind. Crops Prod. 135, 283–293 (2019).
Sah, S. K., Reddy, K. R. & Li, J. Abscisic acid and abiotic stress tolerance in crop plants. Front. Plant Sci. 7, 571 (2016).
Zhao, K. et al. Genome-wide analysis and expression profile of the bZIP gene family in poplar. BMC Plant Biol. 21, 1–16 (2021).
Li, J., Han, G., Sun, C. & Sui, N. Research advances of MYB transcription factors in plant stress resistance and breeding. Plant Signal. Behav. 14, 1613131 (2019).
Wang, X., Niu, Y. & Zheng, Y. Multiple functions of MYB transcription factors in abiotic stress responses. Int. J. Mol. Sci. 22, 6125 (2021).
Zhang, T., Zhao, Y., Wang, Y., Liu, Z. & Gao, C. Comprehensive analysis of MYB gene family and their expressions under abiotic stresses and hormone treatments in Tamarix hispida. Front. Plant Sci. 9, 1303 (2018).
Li, S.-B., Xie, Z.-Z., Hu, C.-G. & Zhang, J.-Z. A review of auxin response factors (ARFs) in plants. Front. Plant Sci. 7, 47 (2016).
Xu, C. et al. Auxin-mediated Aux/IAA-ARF-HB signaling cascade regulates secondary xylem development in Populus. New Phytol. 222, 752–767 (2019).
Yu, J. et al. Overexpression of pyrabactin resistance-like abscisic acid receptors enhances drought, osmotic, and cold tolerance in transgenic poplars. Front. Plant Sci. 8, 1752 (2017).
Pizzio, G. A. et al. The PYL4 A194T mutant uncovers a key role of PYR1-LIKE4/PROTEIN PHOSPHATASE 2CA interaction for abscisic acid signaling and plant drought resistance. Plant Physiol. 163, 441–455 (2013).
Li, C., Ng, C.K.-Y. & Fan, L.-M. MYB transcription factors, active players in abiotic stress signaling. Environ. Exp. Bot. 114, 80–91 (2015).
Kaur, A., Pati, P. K., Pati, A. M. & Nagpal, A. K. In-silico analysis of cis-acting regulatory elements of pathogenesis-related proteins of Arabidopsis thaliana and Oryza sativa. PLoS ONE 12, e0184523 (2017).
Yang, X. et al. Genome-wide analysis of the MYB-related transcription factor family and associated responses to abiotic stressors in Populus. Int. J. Biol. Macromol. 191, 359–376 (2021).
Yang, X. et al. Comprehensive analysis of the R2R3-MYB transcription factor gene family in Populus trichocarpa. Ind. Crops Prod. 168, 113614 (2021).
Zhao, J., Favero, D. S., Qiu, J., Roalson, E. H. & Neff, M. M. Insights into the evolution and diversification of the AT-hook motif nuclear localized gene family in land plants. BMC Plant Biol. 14, 1–19 (2014).
Wang, H. et al. Comprehensive analysis of AHL gene family and their expression under drought stress and ABA treatment in Populus trichocarpa. PeerJ 9, e10932 (2021).
Baena-González, E. Energy signaling in the regulation of gene expression during stress. Mol. Plant 3, 300–313 (2010).
Li, P. et al. Plant transporters: Roles in stress responses and effects on growth and development. Plant Growth Regul. 93, 1–14 (2021).
Sanaullah, T. et al. Transporters and Plant Osmotic Stress 307–344 (Elsevier, Academic Press, 2021).
Dahuja, A. et al. Role of ATP-binding cassette transporters in maintaining plant homeostasis under abiotic and biotic stresses. Physiol. Plant. 171, 785–801 (2021).
Rigas, S. et al. TRH1 encodes a potassium transporter required for tip growth in Arabidopsis root hairs. Plant Cell 13, 139–151 (2001).
Zou, N., Li, B., Dong, G., Kronzucker, H. J. & Shi, W. Ammonium-induced loss of root gravitropism is related to auxin distribution and TRH1 function, and is uncoupled from the inhibition of root elongation in Arabidopsis. J. Exp. Bot. 63, 3777–3788 (2012).
Nimeth, B. A., Riegler, S. & Kalyna, M. Alternative splicing and DNA damage response in plants. Front. Plant Sci. 11, 91 (2020).
Sun, X. et al. AsHSP17, a creeping bentgrass small heat shock protein modulates plant photosynthesis and ABA-dependent and independent signalling to attenuate plant response to abiotic stress. Plant Cell Environ. 39, 1320–1337 (2016).
Zhang, J. et al. Hsf and Hsp gene families in Populus: Genome-wide identification, organization and correlated expression during development and in stress responses. BMC Genom. 16, 1–19 (2015).
Chen, J., Yin, W. & Xia, X. Transcriptome profiles of Populus euphratica upon heat shock stress. Curr. Genomics 15, 326–340 (2014).
Zhang, C., Hicks, G. R. & Raikhel, N. V. Plant vacuole morphology and vacuolar trafficking. Front. Plant Sci. 5, 476 (2014).
Gigli-Bisceglia, N., Engelsdorf, T. & Hamann, T. Plant cell wall integrity maintenance in model plants and crop species-relevant cell wall components and underlying guiding principles. Cell. Mol. Life Sci. 77, 2049–2077 (2020).
Lillo, C. et al. Protein phosphatases PP2A, PP4 and PP6: Mediators and regulators in development and responses to environmental cues. Plant Cell Environ. 37, 2631–2648 (2014).
Zhang, S. et al. Expression of cotton PLATZ1 in transgenic Arabidopsis reduces sensitivity to osmotic and salt stress for germination and seedling establishment associated with modification of the abscisic acid, gibberellin, and ethylene signalling pathways. BMC Plant Biol. 18, 1–11 (2018).
Tkaczuk, K. L. et al. Structural and functional insight into the universal stress protein family. Evol. Appl. 6, 434–449 (2013).
Ke, Q. et al. Down-regulation of GIGANTEA-like genes increases plant growth and salt stress tolerance in poplar. Plant Biotechnol. J. 15, 331–343 (2017).
Lou, Y.-R., Bor, M., Yan, J., Preuss, A. S. & Jander, G. Arabidopsis NATA1 acetylates putrescine and decreases defense-related hydrogen peroxide accumulation. Plant Physiol. 171, 1443–1455 (2016).
Jammes, F. et al. Acetylated 1,3-diaminopropane antagonizes abscisic acid-mediated stomatal closing in Arabidopsis. Plant J. 79, 322–333 (2014).
Irmisch, S. et al. Two herbivore-induced cytochrome P450 enzymes CYP79D6 and CYP79D7 catalyze the formation of volatile aldoximes involved in poplar defense. Plant Cell 25, 4737–4754 (2013).
Shaw, P. E. Peptidyl-prolyl isomerases: A new twist to transcription. EMBO Rep. 3, 521–526 (2002).
Wang, W., Vinocur, B., Shoseyov, O. & Altman, A. Role of plant heat-shock proteins and molecular chaperones in the abiotic stress response. Trends Plant Sci. 9, 244–252 (2004).
Kurek, I., Aviezer, K., Erel, N., Herman, E. & Breiman, A. The wheat peptidyl prolyl cis-trans-isomerase FKBP77 is heat induced and developmentally regulated. Plant Physiol. 119, 693–704 (1999).
Zhang, H. et al. A novel thioredoxin-dependent peroxiredoxin (tpx-q) plays an important role in defense against oxidative stress and is a possible drug target in Babesia microti. Front. Vet. Sci. 7, 76 (2020).
Folda, A. et al. Mitochondrial thioredoxin system as a modulator of cyclophilin D redox state. Sci. Rep. 6, 1–11 (2016).
Irizarry, R. A. et al. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 4, 249–264 (2003).
Leek, J. T., Johnson, W. E., Parker, H. S., Jaffe, A. E. & Storey, J. D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28, 882–883 (2012).
Marot, G., Foulley, J.-L., Mayer, C.-D. & Jaffrézic, F. Moderated effect size and P-value combinations for microarray meta-analyses. Bioinformatics 25, 2692–2699 (2009).
Mohammadi-Dehcheshmeh, M. et al. Unified transcriptomic signature of arbuscular mycorrhiza colonization in roots of Medicago truncatula by integration of machine learning, promoter analysis, and direct merging meta-analysis. Front. Plant Sci. 9, 1550 (2018).
Gupta, S., Stamatoyannopoulos, J. A., Bailey, T. L. & Noble, W. S. Quantifying similarity between motifs. Genome Biol. 8, 1–9 (2007).
Langfelder, P. & Horvath, S. WGCNA: An R package for weighted correlation network analysis. BMC Bioinformatics 9, 1–13 (2008).
Fukushima, A. DiffCorr: An R package to analyze and visualize differential correlations in biological networks. Gene 518, 209–214 (2013).
Anders, S. & Huber, W. Differential expression analysis for sequence count data. Nat. Prec. https://doi.org/10.1038/npre.2010.4282.1 (2010).
Author information
Contributions
A.T. designed the research; A.T. and S.A. collected the data; A.T., S.A. and A.N. analyzed the data; and A.T., S.A. and A.N. wrote the paper. All authors read and approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised: The original version of this Article contained an error in the order of the Figures. Figures 1, 2, 3, 4 and 5 were published as Figure 5, 2, 3, 4 and 1.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Tahmasebi, A., Niazi, A. & Akrami, S. Integration of meta-analysis, machine learning and systems biology approach for investigating the transcriptomic response to drought stress in Populus species. Sci Rep 13, 847 (2023). https://doi.org/10.1038/s41598-023-27746-6
DOI: https://doi.org/10.1038/s41598-023-27746-6