Genome-wide identification of Gossypium INDETERMINATE DOMAIN genes and their expression profiles in ovule development and abiotic stress responses
INDETERMINATE DOMAIN (IDD) transcription factors form one of the largest and most conserved gene families in the plant kingdom and play important roles in various processes of plant growth and development, such as flower induction. To date, systematic and functional analysis of IDD genes in cotton remains in its infancy.
In this study, we identified a total of 162 IDD genes from eight plant species, including 65 IDD genes in Gossypium hirsutum. Phylogenetic analysis divided the IDD genes into seven distinct groups. The gene structures and conserved motifs of GhIDD genes showed highly conserved exon-intron and protein motif distribution patterns. Gene duplication analysis revealed that among 142 orthologous gene pairs, 54 pairs were derived from segmental duplication events and four pairs from tandem duplication events. Further, the Ka/Ks values of most orthologous/paralogous gene pairs were less than one, suggesting purifying selection pressure during evolution. Spatiotemporal expression analysis by qRT-PCR revealed that most of the investigated GhIDD genes showed higher transcript levels in ovules at seven days post anthesis and were upregulated under multiple abiotic stress treatments.
Evolutionary analysis revealed that the IDD gene family was highly conserved in plants during the rapid phase of evolution. Whole genome duplication, as well as segmental and tandem duplication, contributed significantly to the expansion of the IDD gene family in upland cotton. Some distinct genes evolved into a special subfamily, indicating a potential role in the evolution and development of allotetraploid Gossypium hirsutum. High transcript levels of GhIDD genes in ovules point to potential roles in seed and fiber development. Further, the upregulation of GhIDD genes under various abiotic stress treatments suggests that they are important genetic regulators that could be used to improve stress resistance in cotton breeding.
Keywords: Upland cotton; IDD transcription factor; Gene duplication; Collinearity; Spatiotemporal expression; Abiotic stresses
BLAST: Basic Local Alignment Search Tool
qRT-PCR: Quantitative real time polymerase chain reaction
WGD: Whole genome duplication
Transcription factors containing DNA binding domains play an important role in many biological processes in almost all living organisms. They function as either repressors or activators, depending on whether they inhibit or stimulate the transcription of target genes. Transcription factors of the same family generally have distinct actions because of differences in their domains and protein regions that tend to diverge from one another (Eveland et al. 2014).
According to the number and arrangement of cysteine (C) and histidine (H) residues, zinc-finger transcription factors fall into five classes: C2H2, C3H, C2C2 (GATA finger), C3HC4 (RING finger), and C2HC5 (LIM finger) (Moreno-Risueno et al. 2015). As one of the largest transcription factor families, C2H2 zinc-finger transcription factors are structurally characterized by the amino acid sequence F/Y-X-C-X2–5-C-X3-F/Y-X5-Ψ-X2-H-X3–5-H, where X is any amino acid and Ψ represents a hydrophobic residue (Fan et al. 2017). Two cysteine and two histidine residues coordinate a zinc ion, and the domain folds into two β-strands and one α-helix to interact with the major groove of DNA (Lee et al. 1989; Parraga et al. 1988). The INDETERMINATE DOMAIN (IDD) gene family (Riddick and Simmons 2014) encodes transcription factors containing a C2H2 (Cys2His2) zinc-finger domain (Colasanti et al. 2006), a domain type that has also been investigated in animals (Riechmann et al. 2000; Takatsuji 1998). Previously, it was reported that the zinc-finger family was only 19% conserved among eukaryotes other than plants (Englbrecht et al. 2004; Pabo and Sauer 1992), suggesting that extensive duplication resulted in the expansion of the zinc-finger gene family in plants (Coelho et al. 2018).
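The C2H2 consensus given above can be written directly as a regular expression. The following is a minimal sketch rather than the method used in this study: the set of residues accepted at the Ψ position is an assumption, and the peptide is an illustrative sequence, not a GhIDD protein.

```python
import re

# Residues accepted at the hydrophobic (Ψ) position; this set is an assumption.
HYDROPHOBIC = "AVILMFWYC"

# F/Y-X-C-X2–5-C-X3-F/Y-X5-Ψ-X2-H-X3–5-H expressed as a regular expression.
C2H2_PATTERN = re.compile(
    r"[FY].C.{2,5}C.{3}[FY].{5}[" + HYDROPHOBIC + r"].{2}H.{3,5}H"
)


def find_c2h2_fingers(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched segment) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(protein.upper())]


# Illustrative peptide only; not taken from the cotton genome.
example = "MKTFQCRICMRNFSRSDHLTTHIRTHTGEK"
print(find_c2h2_fingers(example))
```

A pattern scan of this kind only flags candidate fingers; domain databases use profile-based models that tolerate more variation than a single regular expression.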
IDD proteins are known to have multiple functions in plant development. In maize (Zea mays), three IDDs have been characterized. The ID1 gene was first reported to induce the phase transition from vegetative to reproductive growth in maize (Colasanti et al. 1998). In rice, OsID1/Ehd2/RID1 has also been found to play an important role in mediating flower initiation in addition to the vegetative to reproductive phase transition (Colasanti et al. 2006; Matsubara et al. 2008; Park et al. 2008; Wong and Colasanti 2007; Wu et al. 2008). Furthermore, OsIDD10 is involved in ammonium absorption and nitrogen metabolism (Xuan et al. 2013). In Arabidopsis, 16 IDD genes were identified (Colasanti et al. 2006). Among them, AtIDD8 and AtIDD14 play an important role in sugar and starch metabolism (Ingkasuwan et al. 2012). AtIDD8 is phosphorylated by AKIN10, and its loss-of-function mutant idd8–3 exhibits late flowering in Arabidopsis. Moreover, SnRK1 interacts with AtIDD8 to control sugar metabolism during the flowering transition (Jeong et al. 2015). Similarly, AtIDD15 has been reported to participate in sugar and starch metabolism (Tanimoto et al. 2008), as well as in the gravitropic response, while AtIDD3 and AtIDD8 are involved in root development (Ingkasuwan et al. 2012). AtIDD10 (JKD) is essential for the precise expression of GL2 (GLABRA2), CPC (CAPRICE), and WER (WEREWOLF), and it has been proposed that JKD acts in the cortex to define root hair cells in the epidermis (Hassan et al. 2010). Moreover, AtIDD9 plays a role in epidermal cell fate specification (Long et al. 2015a; Long et al. 2015b). Additionally, AtIDD3 binds to the SCL3 promoter to control plant development and regulates the expression of downstream genes in a gibberellin (GA) signaling-dependent manner (Yoshida et al. 2014). AtIDD14, AtIDD15, and AtIDD16 regulate the expression of genes involved in auxin biosynthesis, thereby influencing organ morphogenesis (Cui et al. 2013).
Cotton (Gossypium hirsutum L.) is the preeminent source of natural fiber and is cultivated worldwide (Rinehart et al. 1996). It provides an important raw material for the textile industry. However, low fiber quality and yield are the main factors limiting its overall contribution and consumption worldwide. Cotton also faces several abiotic stresses that restrict its growth and productivity. The roles of IDDs in growth and development have been well described in model plants such as Arabidopsis, rice and maize, but IDD genes in upland cotton have not yet been investigated. The present study provides a systematic analysis of IDD genes in G. hirsutum, covering genome-wide structure, spatiotemporal expression patterns and stress responses. A total of 65 GhIDD gene family members were identified and further characterized to explore their phylogenetic relationships, chromosomal locations, gene duplication, gene structures, conserved motifs, spatiotemporal expression patterns and responses to various abiotic stresses. This study will help to understand the evolution of GhIDD genes and provides a foundation for exploring the functional mechanisms of GhIDD genes in plant growth, fiber development and abiotic stress tolerance in cotton.
Identification and chemical characterization of IDD family members
The protein sequences of 16 IDD genes from Arabidopsis thaliana were used as queries for the computational identification of IDD genes in Gossypium arboreum (ICR, version 1.0), G. hirsutum (NAU, version 1.1), G. raimondii (JGI, version 2.0), Oryza sativa (version 7.0), Zea mays (version 1.1), Physcomitrella patens (moss) (version 3.3), Selaginella moellendorffii (fern) (version 1.0), Theobroma cacao (version 1.1) and Chlamydomonas reinhardtii (algae) (version 1.0). The genome databases were downloaded from Phytozome (version 11) (https://phytozome.jgi.doe.gov/pz/portal.html) for all species except G. arboreum, G. hirsutum, G. raimondii and A. thaliana. The G. arboreum genome was downloaded from a publicly available online resource (ftp://bioinfo.ayit.edu.cn/downloads/), while the G. hirsutum and G. raimondii databases were downloaded from COTTONGEN (https://www.cottongen.org/). The A. thaliana database was downloaded from TAIR 10 (http://www.arabidopsis.org). The putative IDD protein sequences retrieved by local BLASTP were further confirmed using SMART (Letunic et al. 2015) (http://smart.emblheidelberg.de/), the InterProScan 63.0 program (http://www.ebi.ac.uk/InterProScan/) and a hidden Markov model (HMM) (Jones et al. 2014). Gene IDs were listed and gene names were assigned according to their positions on chromosomes (Additional file 1: Table S1). The ExPASy ProtParam tool (http://us.expasy.org/tools/protparam.html) was employed to predict the biophysical characteristics and protein localization of all GhIDDs.
Phylogenetic tree construction and conserved IDD sequences analyses
Full-length protein sequences of IDD genes from eight species (G. hirsutum, G. arboreum, G. raimondii, T. cacao, A. thaliana, O. sativa, P. patens, and S. moellendorffii) were aligned to construct a phylogenetic tree with the MEGA 7.0 program using the maximum likelihood (ML) method (Kumar et al. 2016). To test the tree, the bootstrap method with 1 000 replicates and a 50% cutoff value was used. Further, two other phylogenetic trees, one of 110 IDD genes from three cotton species (G. hirsutum, G. arboreum, G. raimondii) and one of 65 IDD genes from G. hirsutum, were constructed using the neighbor-joining (NJ) method (Kumar et al. 2016) in MEGA 7.0. Next, for the analysis of conserved sequence logos, a multiple sequence alignment of IDD proteins of A. thaliana, rice, and upland cotton (G. hirsutum) was performed with Clustal X 2.0, and the results were submitted to the WebLogo online program (Crooks et al. 2004) to visualize conserved amino acid sequence logos.
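The height of each stack in a sequence logo reflects the information content of the corresponding alignment column. The sketch below illustrates that quantity for protein alignments under simplifying assumptions: gaps are ignored, and the small-sample correction that logo tools may apply is omitted. The function names are hypothetical and the toy alignment is not IDD data.

```python
import math
from collections import Counter


def column_information(column: str, alphabet_size: int = 20) -> float:
    """Information content (bits) of one alignment column, ignoring gaps.

    Computed as log2(alphabet_size) minus the Shannon entropy of the
    observed residue frequencies, with no small-sample correction.
    """
    residues = [c for c in column.upper() if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def alignment_information(aligned: list[str]) -> list[float]:
    """Per-column information content for equal-length aligned sequences."""
    return [column_information("".join(col)) for col in zip(*aligned)]


# Toy alignment: the first column is invariant, the second has two residues.
print(alignment_information(["CA", "CG", "CA", "CG"]))
```

A fully conserved column reaches the maximum of log2(20) bits, which is why invariant cysteine and histidine positions dominate logos of zinc-finger domains.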
Analyses of gene structures and conserved motifs
We performed an exon–intron structural and conserved motif analysis of the 65 IDD genes of G. hirsutum. Sequences were first aligned using Clustal X 2.0, and then a phylogenetic tree was constructed using the NJ method in MEGA 7.0. To examine gene structures, the BED file along with the NJ phylogenetic tree data was submitted to the GSDS 2.0 (Gene Structure Display Server 2.0) online tool (Hu et al. 2015) (http://gsds.cbi.pku.edu.cn/index.php). Motifs were examined by submitting full-length protein sequences to the MEME online program (Bailey et al. 2006) (http://memesuite.org/tools/meme), with parameters as described previously (Li et al. 2019).
Chromosomal mapping, gene duplication and Ka/Ks values
The chromosomal positions of GhIDDs were obtained from the cotton genome annotation file (ftp://ftp.bioinfo.wsu.edu/species/Gossypium_hirsutum/NAUNBI_G), from which the gff3 file was extracted. The physical locations of GhIDD genes were mapped using the MapInspect program (Jia et al. 2018) (http://www.plantbreeding.wur.nl/UK/software_mapinspect.html) to visualize their distribution on the corresponding chromosomes. Orthologous and paralogous gene pairs of the GhIDD genes were obtained by all-versus-all BLASTP searches (Altschul et al. 1990). The BLASTP results were then analyzed by MCscan, which generated collinearity blocks for the cotton IDD genes between and within the At and Dt sub-genomes of upland cotton. The collinear pairs of IDD genes generated by MCscan were used to construct a collinearity map using CIRCOS software (Krzywinski et al. 2009). To estimate Ka/Ks values, the amino acid sequences of orthologous gene pairs were first aligned with Clustal X 2.0, and the protein alignments were then converted to the corresponding cDNA (codon) alignments using the PAL2NAL program (Suyama et al. 2006) (http://www.bork.embl.de/pal2nal/). Further, non-synonymous (Ka) and synonymous (Ks) divergence values were calculated with the CODEML program of the PAML package (Yang 2007).
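The conversion of a protein alignment into a codon alignment, which precedes the Ka/Ks calculation, can be illustrated with a short sketch. This is a simplified stand-in for the idea, not the PAL2NAL implementation: the function name is hypothetical, and it assumes the coding sequence corresponds exactly to the protein, with an optional trailing stop codon.

```python
def thread_codons(aligned_protein: str, cds: str) -> str:
    """Map an ungapped CDS onto one gapped row of a protein alignment.

    Each residue consumes one codon and each gap '-' becomes '---'.
    A single trailing stop codon in the CDS is tolerated and dropped.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(cds) not in (3 * n_residues, 3 * n_residues + 3):
        raise ValueError("CDS length does not match the protein row")
    out = []
    pos = 0
    for aa in aligned_protein:
        if aa == "-":
            out.append("---")
        else:
            out.append(cds[pos:pos + 3])
            pos += 3
    return "".join(out)


# Toy example: a two-residue protein row with one alignment gap.
print(thread_codons("M-K", "ATGAAATAA"))
```

Keeping gaps in multiples of three preserves the reading frame, which codon-based substitution models require.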
RNA-seq data analysis of GhIDD genes
To determine the expression patterns of the GhIDD genes in 22 different cotton tissues (vegetative, reproductive and fiber), we used publicly available high-throughput RNA-seq data (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4482290/). TopHat and Cufflinks were used to analyze RNA-seq expression, and gene expression was normalized as fragments per kilobase of transcript per million mapped fragments (FPKM) (Trapnell et al. 2012). The IDD expression values were extracted from the expression data. Genesis software (Sturn et al. 2002) was used to generate heat maps of IDD expression in various tissues and in response to abiotic stresses including cold, heat, salt (300 mmol·L−1 NaCl) and 10% PEG 6000.
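For orientation, the basic FPKM normalization can be sketched as below. This shows only the textbook formula; Cufflinks estimates abundances with a statistical model and does not simply apply this arithmetic. The function name and the numbers in the example are hypothetical.

```python
def fpkm(fragments: int, gene_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments)
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments on a 2 000 bp transcript
# in a library of 20 million mapped fragments.
print(fpkm(500, 2000, 20_000_000))
```

Dividing by both transcript length and library size is what makes values comparable across genes and across the tissue samples shown in a heat map.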
Plant material and treatments
Cotton seeds of CCRI24 were obtained from the Institute of Cotton Research of the Chinese Academy of Agricultural Sciences. To analyze spatial and temporal expression patterns, different plant tissues, namely root, stem, leaf, flower, ovules at 1, 3, 5, 7, 10, 15 and 20 DPA (days post anthesis) and fibers at 7, 10, 15 and 20 DPA, were collected for RNA preparation from cotton plants grown under field conditions (Zhengzhou, China). To investigate the expression of GhIDD genes under abiotic stresses, seeds were germinated on wet filter paper for 3 days at 28 °C, and seedlings were transferred to a liquid culture medium (Yang et al. 2014). At the 3-leaf stage, the seedlings were treated at 4 °C and 38 °C for cold and heat stress, respectively, and with 10% PEG 6000 or 300 mmol·L−1 NaCl; the true leaves were sampled at 0, 1, 2, 4, and 6 h of treatment. Total RNA was extracted using the RNAprep Pure Plant Kit (TIANGEN, Beijing, China) according to the manufacturer’s instructions. First-strand cDNA was synthesized using a Prime Script® RT reagent kit (Takara, Dalian, China). SYBR Premix Ex Taq™ II (Takara) was used for PCR amplification on the Light Cycler 480 system (Roche Diagnostics, Mannheim, Germany) for real-time PCR. Each qRT-PCR assay had three biological replicates, each consisting of three technical replicates. Histone 3 from cotton (GenBank accession number AF024716) was used as an internal control (Wan et al. 2016). The relative fold difference (N) was calculated as $N = 2^{-\Delta\Delta C_t} = 2^{-(\Delta C_{t,\mathrm{treated}} - \Delta C_{t,\mathrm{control}})}$, where $\Delta\Delta C_t$ is the ΔCt of the treated sample minus the ΔCt of the untreated control sample. The primers used in this study are listed in Additional file 1: Table S2.
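The relative fold difference calculation can be sketched as follows, assuming that each ΔCt is the Ct of the target gene minus the Ct of the internal control (Histone 3) in the same sample. The function name and the Ct values are hypothetical.

```python
def relative_expression(
    ct_target_treated: float,
    ct_ref_treated: float,
    ct_target_control: float,
    ct_ref_control: float,
) -> float:
    """Relative fold difference N = 2 ** -(dCt_treated - dCt_control)."""
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier after
# treatment while the reference gene is unchanged, giving a 4-fold increase.
print(relative_expression(22.0, 18.0, 24.0, 18.0))
```

In practice the Ct values entering this formula would be means over the technical replicates of each biological replicate.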
Genome-wide identification of IDD genes
We identified a total of 162 IDD genes in the 8 investigated plant species, comprising a monocot (O. sativa), dicots (A. thaliana, G. hirsutum, G. arboreum, G. raimondii, and T. cacao), a fern and a moss. However, no IDD gene family member was identified in algae. Among these, 65 IDD genes were confirmed in G. hirsutum, 22 in G. arboreum, 23 in G. raimondii, 15 in T. cacao, 12 in O. sativa, 7 in moss, and 2 in fern; together with the 16 A. thaliana IDD genes used as queries, these account for the total of 162. More IDD genes were identified in G. hirsutum than in G. arboreum, G. raimondii, T. cacao, rice, moss, fern or Arabidopsis, indicating an effect of polyploidization and duplication on the IDD gene family in G. hirsutum.
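As a quick consistency check, the per-species counts reported here can be tallied. The A. thaliana count of 16 is taken from the query set described in the methods; the variable names are illustrative.

```python
# IDD gene counts per species as reported in the text.
idd_counts = {
    "G. hirsutum": 65,
    "G. arboreum": 22,
    "G. raimondii": 23,
    "T. cacao": 15,
    "A. thaliana": 16,  # the 16 Arabidopsis IDD genes used as queries
    "O. sativa": 12,
    "P. patens": 7,
    "S. moellendorffii": 2,
}

total = sum(idd_counts.values())
cotton_total = sum(idd_counts[s] for s in ("G. hirsutum", "G. arboreum", "G. raimondii"))
diploid_sum = idd_counts["G. arboreum"] + idd_counts["G. raimondii"]

print(total, cotton_total, diploid_sum)
```

The totals agree with the 162 genes across eight species and the 110 genes across the three cotton species used for tree construction.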
Phylogenetic analysis of IDD gene family
Moreover, the phylogenetic tree showed a close relationship between cotton and cacao IDD genes, as the genes from these two species clustered closely together in different groups and subgroups of the tree (Fig. 1). However, the number and distribution of IDD genes in cacao and cotton differed in all groups. For instance, in group IDD-G, 14 GhIDD genes showed a close relationship with two cacao IDD genes (TcIDD8 and TcIDD14), supporting the hypothesis that cacao and cotton are closely related and probably derived from the same ancestors (Li et al. 2014).
Furthermore, to explore the evolutionary relationships and potential functional categories among G. hirsutum IDD genes, another phylogenetic tree was constructed by the NJ method. The 65 GhIDD genes were divided into five groups (IDD-a, IDD-b, IDD-c, IDD-d, and IDD-e) (Additional file 2: Figure S1). Group IDD-a was the largest with 21 GhIDD genes, whereas group IDD-b was the smallest with 6. Groups IDD-c and IDD-d contained 16 and 8 genes, respectively. All 14 GhIDD genes in group IDD-e were the same as those in IDD-d of Fig. 2, which showed consistency in our analysis and strengthened the hypothesis that these IDD genes might originate from common ancestors of cotton and cacao.
Biophysical characteristics of GhIDD genes
We predicted the biophysical characteristics of all the members of GhIDD gene family in G. hirsutum. The details of biophysical properties including chromosomal position (start and end points), coding sequence (CDS), number of amino acids (protein length), molecular weight (MW), isoelectric point (pI), and grand average of hydropathicity (GRAVY) of GhIDD genes are provided in Additional file 1: Table S3.
The results indicated that GhIDD coding sequences ranged from 1 140 bp to 2 418 bp for GhIDD37 and GhIDD42, respectively. Similarly, the numbers of amino acids in the predicted protein sequences ranged from 379 to 805 for the same genes. Molecular weights ranged from 41 310.77 to 89 465.69 Da for GhIDD42 and GhIDD13, respectively. The isoelectric point was highest for GhIDD41 (9.68) and lowest for GhIDD60 (8.37). The grand average of hydropathicity values of all GhIDD proteins were less than 0, ranging from −0.843 to −0.62 for GhIDD64 and GhIDD18, respectively. In addition, all G. hirsutum IDD proteins were predicted to localize to the nucleus (Additional file 1: Table S3).
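Two of these quantities are easy to reproduce in outline. The sketch below relates CDS length to protein length, assuming the CDS includes the stop codon, and computes GRAVY as the mean Kyte-Doolittle hydropathy of a sequence. The function names are hypothetical and the test peptide is illustrative, not a GhIDD protein.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def protein_length_from_cds(cds_bp: int) -> int:
    """Number of residues encoded by a CDS that ends with a stop codon."""
    if cds_bp % 3 != 0 or cds_bp < 6:
        raise ValueError("CDS length must be a multiple of 3 with a stop codon")
    return cds_bp // 3 - 1


print(protein_length_from_cds(1140), protein_length_from_cds(2418))
print(gravy("KKDE"))  # a charged toy peptide gives a negative (hydrophilic) value
```

The shortest and longest reported coding sequences (1 140 bp and 2 418 bp) correspond to 379 and 805 residues under this assumption, matching the protein lengths given above.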
Gene structure and conserved motif analysis
Chromosomal distribution, gene duplication and synteny analysis
To assess selection pressure, we investigated the non-synonymous divergence levels (Ka) versus synonymous divergence levels (Ks) for 142 duplicated gene pairs. We found that 125 duplicated gene pairs had Ka/Ks values less than 0.5, while 15 had Ka/Ks values between 0.5 and 1 (Additional file 1: Table S4). Only two duplicated gene pairs (GhIDD15-GhIDD48 and GhIDD23-GhIDD56) had Ka/Ks values greater than 1. Thus, the Ka/Ks values of most duplicated gene pairs were less than 1, indicating that the upland cotton IDD gene family underwent strong purifying selection with limited functional divergence. This might have occurred after segmental and whole genome duplication (WGD) events during polyploidization followed by hybridization in its evolutionary history.
Conserved amino acid residues
Spatial and temporal expression pattern of GhIDD genes
The plant IDD gene family has important roles in plant growth and development, such as root development (Ingkasuwan et al. 2012; Yoshida et al. 2014) and sugar and starch metabolism during the floral transition in maize, rice and Arabidopsis (Colasanti et al. 2006; Ingkasuwan et al. 2012; Matsubara et al. 2008; Park et al. 2008; Wong and Colasanti 2007; Wu et al. 2008). The spatiotemporal expression of a transcript is tightly correlated with the biological function of the gene. To investigate the tissue-specific expression patterns of different IDD genes, RNA-seq data were downloaded from NCBI to generate a heat map. All the genes were clustered according to their expression patterns in vegetative organs (root, stem, and leaf), reproductive organs (torus, petal, stamen, pistil, and calycle), ovules (−3, −1, 0, 1, 3, 5, 10, 20, 25 and 35 DPA) and fibers (5, 10, 20, and 25 DPA) (Additional file 3: Figure S2). The heat map showed that most GhIDD genes had a ubiquitous expression pattern across the observed tissues, while a minority showed much lower expression levels. Only GhIDD7 and GhIDD38 displayed stamen-specific expression.
Responses of GhIDD genes under various abiotic stresses
Plants often face various stresses such as heat, cold, drought, and high salinity, which influence plant growth and productivity. These stresses induce or repress the expression of various genes, thereby affecting gene functions related to plant growth and development. To investigate the responses of GhIDD genes to different abiotic stresses, RNA-seq data were downloaded from NCBI and a heat map depicting the different responses was constructed (Additional file 4: Figure S3). The RNA-seq data revealed that the genes clustered according to their responses to specific abiotic stresses, which indicated positive and negative regulatory roles of GhIDD genes under different abiotic stresses.
The C2H2 transcription factor family encoded by IDD genes is one of the biggest plant gene families and plays an important role in plant development and growth. In previous studies, the IDD gene family was identified in rice, maize, Arabidopsis, and apple (Colasanti et al. 2006; Fan et al. 2017). However, genome-wide identification and analysis of IDD genes had not been performed in cotton until now. In the present study, a comprehensive identification and analysis of IDD genes in G. hirsutum, G. arboreum, G. raimondii, T. cacao, A. thaliana, O. sativa, P. patens (moss), and S. moellendorffii (fern) was performed. We focused on the IDD genes in the allotetraploid cotton G. hirsutum to understand the roles of the IDD gene family in cotton development.
A phylogenetic analysis was applied to determine the evolutionary relationships among IDD genes from eight species. No IDD gene family member was identified in algae, indicating that the first IDD gene originated in a moss, which agrees with the result of a previous study (Wu et al. 2016). A total of 162 IDD genes were divided into seven groups (IDD-A, IDD-B, IDD-C, IDD-D, IDD-E, IDD-F, and IDD-G). Most cotton IDD genes showed a closer relationship with cacao IDD genes, suggesting that cotton and cacao evolved from common ancestors. Additionally, another phylogenetic tree was constructed from two diploid cotton species and one allotetraploid cotton species to confirm the evolutionary relationships among them. This tree divided the IDD genes into four groups, IDD-a to IDD-d. Among these, group IDD-d had only 14 GhIDD genes, which might be the result of introgression during hybridization and polyploidization. Further, these results also strengthen the previous finding that G. hirsutum evolved from the hybridization of A and D genome cottons (G. arboreum and G. raimondii, respectively), as most IDD genes from all three cotton species were closely distributed in the phylogenetic tree (Li et al. 2014). To better understand the evolutionary history of GhIDD genes, another phylogenetic tree was constructed among GhIDD genes, in which the 65 GhIDD genes were distributed into five groups. Consistent with our findings, group IDD-e contained the same 14 GhIDD genes as in the previous phylogenetic analysis (Additional file 2: Figure S1). Together, these results indicate that some IDD genes (14 genes in IDD-e of Fig. 2) are very ancient and important in plant evolution and development, and may comprise core gene resources in the plant.
Biophysical characteristics and chromosomal location
The predicted biophysical characteristics of the GhIDD gene family members also provided valuable information. All 65 GhIDD proteins identified in G. hirsutum were predicted to be located in the nucleus. The isoelectric point and grand average of hydropathicity (GRAVY) values of the 65 GhIDDs suggested that all IDD proteins were alkaline and hydrophilic. These results are in accordance with a previous genome-wide study of IDDs in apple, which reported the alkaline and hydrophilic nature of all identified IDD proteins, with isoelectric point values above 7 and GRAVY values below 0 (Fan et al. 2017).
Furthermore, the 65 identified GhIDD genes were distributed on 21 chromosomes of the At and Dt sub-genomes of upland cotton and did not display obvious sub-genome bias. Of the 65 GhIDD genes, 30 were located on 10 At chromosomes (A2, A3, A4, A5, A6, A8, A9, A10, A11, and A12) and 33 on 11 Dt chromosomes (D1, D2, D3, D4, D5, D6, D8, D9, D10, D11 and D12). The remaining two genes (GhIDD1 and GhIDD65) were located on two unoriented scaffolds. The uneven distribution of GhIDD genes across the 21 chromosomes of the At and Dt sub-genomes may reflect the gain or loss of genes during the long evolutionary history of G. hirsutum.
Conserved amino acid residues, protein motifs and gene structure analysis
Furthermore, analysis of conserved amino acid residues in the IDD domain from O. sativa, A. thaliana, and G. hirsutum revealed that the IDD domain was highly conserved in monocotyledons and dicotyledons during evolution. In addition, a total of 10 motifs were identified, which indicated that IDD proteins may function in divergent physiological pathways associated with different co-factors. The motif distribution of IDD proteins was relatively conserved, and minor differences among the proteins from different groups might be associated with particular functions related to growth, development and stress tolerance in cotton. In detail, motifs 5 and 7 are conserved in the IDD-a, b, c and d subfamilies but not in the IDD-e subfamily, while motifs 6 and 10 are found only in proteins of the IDD-e subfamily (Additional file 2: Figure S1), indicating that gene evolution or duplication is correlated with variation in gene function.
Gene structure (exon–intron organization) is important and may be shaped by insertion/deletion events (Lecharny et al. 2003). Several genome-wide studies have shown that the loss or gain of introns during eukaryotic diversification was extensive (Rogozin et al. 2003; Roy and Penny 2007). Gene structure analysis showed that duplicated genes have similar gene structures, while intron length varies among genes, indicating that intron length might play a major role in the functional diversification of GhIDD genes.
Introns have been reported to play a vital role in the evolution of different plant species (Roy and Gilbert 2006). Here, we found that the number of introns varies from one to eight, although most genes have two or three introns. This suggests that G. hirsutum is a newly evolved species with fewer introns, which supports a previous study reporting that a large number of introns decreased over time during an early expansion stage (Roy and Penny 2007) and the suggestion that newly evolved species have fewer introns than their more primitive relatives (Roy and Gilbert 2006).
Gene duplication and selection pressure
We identified 65 GhIDD genes in the upland cotton genome, more than previously identified in Arabidopsis, maize, rice, and apple. The main reason for the larger number of IDD genes is that upland cotton experienced polyploidization. Polyploidization was an important event in the evolution of cotton and contributed to gene duplication (Paterson et al. 2004). G. hirsutum is an allotetraploid cotton that evolved from the hybridization of G. arboreum (A2 genome) and G. raimondii (D5 genome) and is an important plant species for studying polyploidization (Wendel and Cronn 2003). The At and Dt sub-genome donors (G. arboreum and G. raimondii, respectively) of upland cotton are close relatives sharing the same number of orthologs, which resulted in the duplication and doubling of GhIDD gene numbers in upland cotton. Accordingly, the numbers of IDD genes in G. arboreum and G. raimondii are 22 and 23, respectively, each less than half the number in G. hirsutum.
Previous studies have made clear that gene duplication and diversification played an important role in evolution. Gene duplications are found in many plants and usually consist of tandem, segmental, and whole genome duplications (Xu et al. 2012). A tandem duplication event involves two or more genes located on the same chromosome, while a segmental duplication event occurs between different chromosomes (He et al. 2012). Many transcription factor gene families, including the AP2, WOX, YABBY, RH2FE3, and GRAS genes, underwent segmental duplication, which contributed to gene family expansion and functional divergence in cotton (Liu and Zhang 2017; Qanmber et al. 2018; Yang et al. 2017; Yang et al. 2018; Zhang et al. 2018). In our study, 54 of 142 duplicated gene pairs were associated with segmental duplication and four with tandem duplication, which contributed to the expansion of GhIDDs as well as the diversification of GhIDD gene structure and function (Additional file 1: Table S4).
Many gene families have expanded to much higher numbers in plants than in other eukaryotes, suggesting that these expansions correlate with environmental and selection pressure. To estimate selection pressure, the ratio of non-synonymous (Ka) to synonymous (Ks) substitution rates (Ka/Ks) was calculated. Generally, Ka/Ks > 1, Ka/Ks = 1, and Ka/Ks < 1 indicate positive selection, neutral evolution, and purifying selection, respectively. In this study, we found that most Ka/Ks values of the GhIDD genes were smaller than 1.0, indicating that the GhIDD gene family underwent strong purifying selection.
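The chromosome-based distinction between tandem and segmental duplication described above can be expressed as a simple rule. The sketch below applies only that rule; stricter criteria, such as physical adjacency of the two genes, are often added in practice and are not modelled here. The class, function and gene names are hypothetical and are not taken from Table S4.

```python
from typing import NamedTuple


class Locus(NamedTuple):
    gene: str
    chromosome: str


def duplication_type(a: Locus, b: Locus) -> str:
    """Same chromosome -> 'tandem'; different chromosomes -> 'segmental'."""
    return "tandem" if a.chromosome == b.chromosome else "segmental"


# Hypothetical duplicated pairs for illustration only.
pairs = [
    (Locus("geneA", "A05"), Locus("geneB", "A05")),
    (Locus("geneC", "A05"), Locus("geneD", "D05")),
]
labels = [duplication_type(a, b) for a, b in pairs]
print(labels)
```

Applied to a full list of collinear pairs, a rule of this kind yields the counts of tandem and segmental events that are then compared with the family size.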
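The interpretation of Ka/Ks ratios stated above can be sketched as a small classifier. The tolerance used to treat a ratio as equal to 1 is an assumption, the function name is hypothetical, and the (Ka, Ks) values are invented for illustration rather than taken from Table S4.

```python
from collections import Counter


def selection_mode(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Classify a gene pair by its Ka/Ks ratio."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral evolution"
    return "positive selection" if ratio > 1.0 else "purifying selection"


# Hypothetical (Ka, Ks) values for three duplicated gene pairs.
example_pairs = [(0.02, 0.10), (0.08, 0.10), (0.30, 0.10)]
tally = Counter(selection_mode(ka, ks) for ka, ks in example_pairs)
print(tally)
```

Tallying the classes over all duplicated pairs gives the kind of summary used here to conclude that purifying selection predominates.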
GhIDD genes expression in specific tissues and under different stresses
It has been reported that the IDD genes had essential functions in plant growth and development. The AtIDD9 and AtIDD10 interacted with DELLA which were used as scaffolds to mediate GA signaling pathways (Hassan et al. 2010). AtIDD9 also plays an important role in epidermal cell fate specification in root (Long et al. 2015a; Long et al. 2015b). Moreover, It has been noted that AtIDD10 acts upstream of root hair to regulate the accurate alternate pattern of N and H cells around cortex cells (Hassan et al. 2010). The phylogenetic analysis of IDD genes in apple revealed that IDD genes mediated flower induction (Fan et al. 2017). In rice, the IDD homolog LOOSE PLANT ARCHI-TECTURE1 (LPA1/OsIDD16/IDD18) also affects shoot response to gravity by modulating auxin flux in a brassinosteroid-dependent manner (Wu et al. 2013; Xuan et al. 2013).
Here, we analyzed the spatiotemporal expression of GhIDD genes in different tissues by Q-PCR. The results showed that most genes expressed peak in the ovule of 7 DPA, indicating their potential roles in the fiber elongation stage. GhIDD2 may be as a constitutive regulator with its ubiquitous expression pattern. GhIDD4, GhIDD32, GhIDD33, and GhIDD48 may play different roles in cotton vegetative and reproductive development with their distinct expression patterns. Thus, our results indicated that GhIDD genes demonstrated substantial functional diversity during cotton development and suggested that GhIDD genes are playing important function in seed or fiber development.
Previously, it has been reported that IDD genes functions were related to flower transition and epidermal cell development, but there is no report of IDD genes function under different abiotic stresses. Thus, to find whether they might play some roles in stress response, the responses of GhIDD genes under various abiotic stresses were determined. In our study, we found that the GhIDD15, GhIDD21, GhIDD32, GhIDD33, GhIDD42, and GhIDD48 expressions were upregulated under all treated abiotic stresses suggesting that these genes might play positive and important roles under the exposure of different abiotic stresses. In contrast, GhIDD2 gene expression was down-regulated under all abiotic stresses indicating the negative response for abiotic stress. Further, GhIDD4 and GhIDD9 were down-regulated in response to heat and NaCl indicating that these genes might play negative role in response to heat and NaCl. However, GhIDD11 is up-regulated in response to 2 h cold and 6 h PEG treatments. Whereas the expression level of GhIDD4, GhIDD7, GhIDD11, and GhIDD21 were upregulated in response to PEG treatment at 6 h, indicating that these genes might play a role in a type of long-term dehydration tolerance and not as the instant sensors for abiotic stress signaling. In a word, most of the IDD genes were induced by different abiotic stresses, indicating that GhIDD genes might meditate the abiotic stress responses. Although IDD genes showed different expression levels under different stresses, there is no study on the function of cotton IDDs in stress. Therefore, there is need to investigate the functions of IDD genes under abiotic stresses in future studies. In short, our results showed that GhIDD genes may play an important role in plant vegetative development, seed and fiber development and might be proved important regulator in abiotic stresses tolerance of cotton.
The IDD gene family plays a significant role in plant growth and development. We identified 65 IDD genes in the upland cotton genome and systematically investigated their phylogenetic evolution, gene structure variation, transcriptional expression patterns, predicted protein motifs, subcellular localization, and other characteristics. The phylogenetic analysis of IDD genes confirmed the close relationship between cotton and cacao, which are derived from a common ancestor. Collinearity analysis supported the expansion and evolution of GhIDD genes. Furthermore, the spatial and temporal expression patterns in different tissues revealed their diverse functions in cotton development, along with their potential roles in ovule and fiber development. Transcript levels of most GhIDD genes were high in 7 DPA ovules, indicating potentially pivotal roles in seed development and fiber elongation. Moreover, most of the investigated GhIDD genes responded positively to the tested abiotic stresses, suggesting that GhIDD genes are involved in mediating abiotic stress responses. Our study sheds light on cotton GhIDD genes and provides basic information that will help to clarify the evolutionary history of cotton IDD genes and supply candidate genes for genetic engineering to improve abiotic stress tolerance and fiber quality in cotton.
We thank HUO Peng (Zhengzhou Research Center, Institute of Cotton Research of CAAS, Zhengzhou) for technical assistance.
This work was supported by the Major Research Plan of the National Natural Science Foundation of China (No. 31690093), Creative Research Groups of China (31621005), and the Agricultural Science and Technology Innovation Program Cooperation and Innovation Mission (CAAS-XTCX2016).
Availability of data and materials
All data generated or analyzed in this study are included in the published article and its additional files.
Li FG and Wang Z conceived and designed the study; Ali F, Qanmber G, and Li YH carried out the experiments; Ma SY, Lu LL, and Yang ZR analyzed and interpreted the data; Ali F and Wang Z prepared the manuscript. All authors have read, edited, and approved the current version of the manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
- Cui D, Zhao J, Jing Y, et al. The Arabidopsis IDD14, IDD15, and IDD16 cooperatively regulate lateral organ morphogenesis and gravitropism by promoting auxin biosynthesis and transport. PLoS Genet. 2013;9:e1003759. https://doi.org/10.1371/journal.pgen.1003759.PubMedPubMedCentralCrossRefGoogle Scholar
- Jeong EY, Seo PJ, Woo JC, et al. AKIN10 delays flowering by inactivating IDD8 transcription factor through protein phosphorylation in Arabidopsis. BMC Plant Biol. 2015;15:110. https://doi.org/10.1186/s12870-015-0503-8.
- Jia J, Zhao P, Cheng L, et al. MADS-box family genes in sheepgrass and their involvement in abiotic stress responses. BMC Plant Biol. 2018;18:42. https://doi.org/10.1186/s12870-018-1259-8.
- Lecharny A, Boudet N, Gy I, et al. Introns in, introns out in plant gene families: a genomic approach of the dynamics of gene structure. In: Meyer A, Van de Peer Y, eds. Genome Evolution: Gene and Genome Duplications and the Origin of Novel Gene Functions. Dordrecht: Springer Netherlands. 2003;111–6.CrossRefGoogle Scholar
- Liu CX, Zhang TZ. Expansion and stress responses of the AP2/EREBP superfamily in cotton. BMC Genomics. 2017;18:118. https://doi.org/10.1186/s12864-017-3517-9.
- Long Y, Smet W, Cruz-Ramirez A, et al. Arabidopsis BIRD zinc finger proteins jointly stabilize tissue boundaries by confining the cell fate regulator SHORT-ROOT and contributing to fate specification. Plant Cell. 2015b;27:1185–99. https://doi.org/10.1105/tpc.114.132407.PubMedPubMedCentralCrossRefGoogle Scholar
- Park SJ, Kim SL, Lee S, et al. Rice Indeterminate 1 (OsId1) is necessary for the expression of Ehd1 (early heading date 1) regardless of photoperiod. Plant J. 2008;56:1018–29.Google Scholar
- Qanmber G, YU DQ, LI J, et al. Genome-wide identification and expression analysis of Gossypium RING-H2 finger E3 ligase genes revealed their roles in fiber development, and phytohormone and abiotic stress responses. Journal of Cotton Research. 2018;1:1. https://doi.org/10.1186/s42397-018-0004-z.
- Roy SW, Penny D. Patterns of intron loss and gain in plants: intron loss-dominated evolution and genome-wide comparison of O. sativa and A. thaliana. Mol Biol Evol. 2007;24:171–81.Google Scholar
- Yang ZE, Gong Q, Qin WQ, et al. Genome-wide analysis of WOX genes in upland cotton and their expression pattern under different stresses. BMC Plant Biol. 2017;17:113. https://doi.org/10.1186/s12870-017-1065-8.
- Yang ZE, Gong Q, Wang LL, et al. Genome-wide study of YABBY genes in upland cotton and their expression patterns under different stresses. Front Genet. 2018;9:33. https://doi.org/10.3389/fgene.2018.00033.
- Zhang B, Liu J, Yang ZE, et al. Genome-wide analysis of GRAS transcription factor gene family in Gossypium hirsutum L. BMC Genomics. 2018;19:348. https://doi.org/10.1186/s12864-018-4722-x.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.