Introduction

The DOF (DNA-binding one zinc finger) protein family is a plant-specific transcription factor family. A DOF protein was first identified in maize (Zea mays) (Yanagisawa and Izui 1993), and DOF proteins were later found in many other plant species. Unlike other zinc-finger domains, the DOF domain includes a C2C2-type zinc-finger-like motif (Yanagisawa 2002). Most DOF proteins consist of 200–400 amino acids and have the DOF domain in their N-terminal regions. The DOF domain consists of 50–52 amino acids and can bind cis-regulatory elements with the core sequence 5′-T/AAAG-3′ (Yanagisawa 2002). In contrast to the DOF domain, the C-terminal regions of DOF proteins are variable.

DOF proteins regulate their target genes and thereby regulate various biological processes. There are 36 DOF genes in Arabidopsis thaliana, 30 in rice (Oryza sativa) (Lijavetzky et al. 2003), 28 in sorghum (Kushwaha et al. 2011), 34 in tomato (Cai et al. 2013), 37 in Chinese cabbage (Ma et al. 2015), 32 in potato (Venkatesh and Park 2015), 38 putative DOF genes in pigeon pea (Malviya et al. 2015), 36 in cucumber (Wen et al. 2016), 33 in pepper (Kang et al. 2016), 36 in common beans (Ito et al. 2017), 35 in foxtail millet (Zhang et al. 2017), 41 in poplar (Wang et al. 2017), 114 in cotton (Li et al. 2018), 29 in eggplant (Wei et al. 2018), 24 in durian (Khaksar et al. 2019), 40 in alfalfa (Cao et al. 2020), 33 in tef (Mulat and Sinha 2020), 36 in watermelon (Zhou et al. 2020), 50 in Cleistogenes songorica (Wang et al. 2021), 39 in common walnut (Khan et al. 2021), 117 in rapeseed (Lohani et al. 2021), and 51 in olive (Mariyam et al. 2021). However, no DOF genes have been identified in pearl millet (Pennisetum glaucum (L.) R. Br., also known as Cenchrus americanus (L.) Morrone).

Pearl millet is a cross-pollinated diploid C4 crop with seven pairs of chromosomes (2n = 14) and a ~1.79-Gb genome (Varshney et al. 2017). It is one of the most widely cultivated millets in the world. Compared with other crops such as maize and wheat (Triticum aestivum), pearl millet exhibits higher tolerance to stress conditions such as drought, low soil fertility and high temperature (Basavaraj et al. 2010). In this study, we performed a comprehensive analysis of pearl millet DOF genes (PgDOFs) as candidate regulators of pearl millet stress tolerance.

Materials and Methods

Plant Materials

The pearl millet drought-tolerant genotype ICMB 843 (Dudhate et al. 2018) was provided by ICRISAT (International Crops Research Institute for the Semi-Arid Tropics, India) and used for this study. Approximately 60 seeds were sown in pots on soil with fertilizers. The resulting plants were grown under a long-day condition (16-h light/8-h darkness) at 28 °C in a growth chamber. For gene expression analysis using quantitative real-time PCR (qRT-PCR), 28-day-old seedlings were subjected to (a) dehydration stress induced by 15% (w/v) PEG-6000, (b) salinity stress induced by 250 mM NaCl, (c) high temperature stress at 42 °C, and (d) low temperature stress at 4 °C for 6, 12 and 24 h. Leaves and roots of these plants were sampled and stored at -80 °C for RNA isolation followed by cDNA synthesis.

Identification and Annotation of PgDOFs in Pearl Millet

To find PgDOFs, the DOF sequences of Arabidopsis, rice, and foxtail millet were retrieved from the Uniprot (https://www.uniprot.org) database, using the keyword “DOF”. These protein sequences were aligned by Clustal W (Thompson et al. 1994). A profile of this multiple sequence alignment was generated by hmmbuild and used as the query for hmmsearch on HMMER (v3.32), with the pearl millet protein sequences (Varshney et al. 2017) (https://cegsb.icrisat.org/ipmgsc/genome.html) as the search targets. This search returned 37 protein sequences matching that profile. The presence of the DOF domain in these 37 protein sequences was examined by CD-search (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) and Pfam (https://pfam.xfam.org/family/PF02701). These tools detected DOF domains in 12 of those 37 sequences, and these 12 proteins were regarded as PgDOFs. Logos for the DOF domain sequences were obtained by WebLogo 2.8.2 (http://weblogo.berkeley.edu/logo.cgi) (Crooks et al. 2004). The isoelectric point (pI) and molecular weight (MW) of PgDOFs were obtained by ProtParam (https://web.expasy.org/protparam/) (Gasteiger et al. 2005). Subcellular localization of PgDOFs was predicted by WoLF PSORT (https://wolfpsort.hgc.jp) (Horton et al. 2007). Annotations of Arabidopsis thaliana and rice homologs of PgDOFs were downloaded from TGIF-DB (http://webpark2116.sakura.ne.jp/rlgpr/) (Tsugama et al. 2021).
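As a rough illustration of how candidates from the profile search could be shortlisted before domain confirmation, the sketch below parses an hmmsearch per-target table (the layout written by the `--tblout` option) and keeps targets at or below an E-value cutoff. The 1e-5 cutoff, the function name and the sample identifiers are assumptions for illustration only; the threshold used in this study is not stated.

```python
def parse_tblout(text, max_evalue=1e-5):
    """Return (target, evalue, score) tuples for hits at or below max_evalue.

    Assumes the HMMER3 --tblout layout: whitespace-separated columns with the
    target name in column 1 and the full-sequence E-value and bit score in
    columns 5 and 6. The 1e-5 cutoff is an illustrative assumption.
    """
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        target, evalue, score = fields[0], float(fields[4]), float(fields[5])
        if evalue <= max_evalue:
            hits.append((target, evalue, score))
    return sorted(hits, key=lambda h: h[1])  # lowest E-value first


# Hypothetical example rows (identifiers and scores are made up).
example = """\
# target name  accession  query name   accession  E-value  score  bias
Pgl_demo_0001  -          DOF_profile  -          1.2e-30  105.3  0.1
Pgl_demo_0002  -          DOF_profile  -          3.0e-02   12.1  0.0
"""
candidates = parse_tblout(example)
```

Hits that pass such a filter would still need domain-level confirmation, as was done here with CD-search and Pfam.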

Chromosomal Localization and Gene Structure Analysis

Positions of PgDOFs on chromosomes were displayed by Map Gene 2 Chromosome (MG2C) v2 (http://mg2c.iask.in/mg2c_v2.0/) (Chao et al. 2021). The exon-intron structures of PgDOFs were obtained by comparing their cDNA sequences with their genomic sequences on Gene Structure Display Server 2.0 (http://gsds.gao-lab.org) (Hu et al. 2015).

Motifs Analysis and Phylogenetic Tree Construction

The MEME Suite 5.5.0 (http://meme-suite.org/tools/meme) (Bailey et al. 2015) was used to examine the conserved motifs in PgDOFs. The maximum number of motifs was set at 15, and the minimum and maximum motif widths were set at 6 and 50 amino acids, respectively. Eukaryotic linear motifs (ELMs) (Kumar et al. 2022) for PgDOF1-12 were downloaded from TGIF-DB (Tsugama et al. 2021). A phylogenetic tree of PgDOFs and DOF proteins of Setaria italica (foxtail millet), Oryza sativa (rice), sorghum, maize, and Arabidopsis was generated with the ClustalW multiple-alignment program and the neighbor-joining method, and evaluated by bootstrapping with 1000 resamplings. The resulting phylogenetic tree was displayed on iTOL v6 (Letunic and Bork 2019).

Synteny Analysis

All amino acid sequences as well as general feature format (GFF) genome annotation files of pearl millet, Arabidopsis, rice, maize and foxtail millet were downloaded from the International Pearl Millet Genome Sequencing Consortium website (https://cegsb.icrisat.org/ipmgsc/), The Arabidopsis Information Resource (TAIR) (https://www.arabidopsis.org/index.jsp), the Rice Genome Annotation Project website (http://rice.uga.edu/), the Ensembl Plants Zea mays website (http://plants.ensembl.org/Zea_mays/Info/Index) and the Ensembl Plants Setaria italica website (https://plants.ensembl.org/Setaria_italica/Info/Index), respectively. BLASTP in the BLAST+ suite (Camacho et al. 2009) was run with all combinations of those sequence sets as the query and the database. For this analysis, the options were set to “-outfmt 6 -num_alignments 5” to use the tabular output format and to limit the number of alignments per query to five. The resulting files and the above GFF files were concatenated and used as the input for MCScanX (Wang et al. 2012) to analyze colinear syntenic blocks of genes. Default options were used for MCScanX.
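To make the tabular BLAST output concrete, the following minimal sketch reads `-outfmt 6` text (12 tab-separated columns in the standard order) and keeps the highest-scoring hits per query. It only post-processes BLASTP output; the function name and the gene identifiers in the example are hypothetical, and this is not the processing that MCScanX performs internally.

```python
from collections import defaultdict

# Standard column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text, max_hits=5):
    """Group tabular BLAST hits by query, keeping up to max_hits per query."""
    per_query = defaultdict(list)
    for line in tabular_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        row = dict(zip(COLUMNS, line.split("\t")))
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        per_query[row["qseqid"]].append(row)
    # Rank each query's hits by bit score (highest first) and truncate.
    return {q: sorted(rows, key=lambda r: -r["bitscore"])[:max_hits]
            for q, rows in per_query.items()}


# Hypothetical rows; identifiers and values are made up for illustration.
example = "\n".join([
    "PgGeneA\tOsGeneX\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-50\t180",
    "PgGeneA\tOsGeneY\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-20\t90.5",
    "PgGeneB\tOsGeneZ\t95.0\t50\t2\t0\t1\t50\t1\t50\t1e-25\t110",
])
top = best_hits(example, max_hits=1)
```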

In silico Analysis of cis-acting Elements in Promoters of PgDOFs

The 2000-bp sequences upstream of the start codons of PgDOFs were used as their promoters, as previously described (Zhang et al. 2021). These sequences were extracted from the pearl millet genome sequence. Known cis-acting elements in these sequences were identified by PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) (Lescot et al. 2002).
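The extraction of an upstream region can be sketched as below. This is a minimal, strand-aware illustration that assumes the chromosome sequence is held in memory as a string and that the coordinate of the first base of the start codon is known from the annotation; the function names and the toy sequences are hypothetical and do not reproduce the scripts used in this study.

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(chrom_seq, start_codon_pos, strand, length=2000):
    """Return up to `length` bp upstream of a start codon, in gene orientation.

    start_codon_pos is the 1-based forward-strand coordinate of the first
    base of the start codon (the A of ATG). For '-' strand genes this is the
    highest forward-strand coordinate of the codon. The region is truncated
    at chromosome ends.
    """
    if strand == "+":
        end = start_codon_pos - 1          # 0-based, exclusive
        begin = max(0, end - length)
        return chrom_seq[begin:end]
    if strand == "-":
        begin = start_codon_pos            # 0-based index just past the codon
        end = min(len(chrom_seq), begin + length)
        return reverse_complement(chrom_seq[begin:end])
    raise ValueError("strand must be '+' or '-'")


# Toy sequences: ATG at positions 5-7 on the forward strand, and a
# minus-strand gene whose ATG appears as CAT at positions 4-6.
plus_promoter = upstream_region("GGGGATGCCC", 5, "+", length=3)
minus_promoter = upstream_region("CCCCATAAGT", 6, "-", length=3)
```

Returning the minus-strand region as its reverse complement keeps every promoter in the 5′ to 3′ orientation of its gene, which is the orientation expected when scanning for cis-acting elements.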

RNA Extraction and Expression of PgDOFs Under Stressed Conditions

Total RNA was prepared from leaves and roots of the stressed 28-day-old pearl millet plants (see the above subsection “Plant Materials”) with the NucleoSpin RNA Plant kit (MACHEREY-NAGEL, Germany). cDNA was prepared from 1 µg of the total RNA with PrimeScript Reverse Transcriptase (Takara Bio, Japan) and the oligo (dT) primer, and used as the template for qRT-PCR. The StepOne Real-Time PCR System (Applied Biosystems, USA) and TB Green Premix Ex Taq (Takara Bio, Japan) were used to perform the qRT-PCR. The PCR program was 95 °C for 30 s, followed by 40 cycles of 95 °C for 5 s and 60 °C for 30 s. Primers used for the analysis are listed in Table S1. The pearl millet Actin (ACT) gene (Anup et al. 2017) was used as the internal control.
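The legend of Fig. 5 states that expression levels were calculated by the comparative cycle threshold (Ct) method with ACT as the internal control. A minimal sketch of that calculation, in its commonly used 2^(-ΔΔCt) form, is given below. It assumes roughly 100% amplification efficiency for both genes and uses the 0-h sample as the calibrator; the function name and the Ct values in the example are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the comparative Ct method.

    Each Ct is first normalized to the reference gene (delta Ct), then the
    treated sample is compared with the control sample (delta-delta Ct).
    Assumes about 100% amplification efficiency for both genes.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target crosses the threshold 4 cycles earlier
# after treatment while the reference gene is unchanged.
fold = relative_expression(ct_target_treated=22.0, ct_ref_treated=18.0,
                           ct_target_control=26.0, ct_ref_control=18.0)
```

With these values the target is estimated to be 16-fold more abundant in the treated sample than in the control.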

Results

Identification of PgDOFs

Twelve PgDOFs (PgDOF1-12) were identified by our analysis. Among PgDOF1-12, PgDOF12 was the smallest with 262 amino acids, and PgDOF4 was the largest with 720 amino acids. The pI of PgDOF1-12 was between 4.62 (for PgDOF7) and 11.92 (for PgDOF12). PgDOF1, PgDOF2, PgDOF4, and PgDOF12 were predicted to be localized to the plastids, whereas the other PgDOFs were predicted to be localized to the nucleus (Table 1).

Table 1 The 12 PgDOFs identified in pearl millet and their sequence features

PgDOF1-10 were located on four of the seven pearl millet chromosomes: three on Chromosome 2, three on Chromosome 5, three on Chromosome 6, and one on Chromosome 7 (Fig. 1). PgDOF11 and PgDOF12 were located on scaffolds (sequences that are unassigned to chromosomes). These genes were physically distant from each other, suggesting that none of them is a tandemly duplicated gene. PgDOF1-10 and genes in their proximity were present in colinear syntenic blocks with their homologs from rice, maize and foxtail millet, whereas PgDOF11 and PgDOF12 were not (Dataset S1). This result suggests that at least PgDOF1-10 may be segmentally duplicated genes.

Fig. 1

Positions of PgDOFs in pearl millet chromosomes. The chromosome number is indicated at the top of each bar in red. The scale is represented in megabase (Mb)

Phylogenetic Classification and Protein Sequence Alignment of PgDOFs

A phylogenetic tree was generated, including PgDOFs and DOF proteins from five other species: foxtail millet, rice, sorghum, maize, and Arabidopsis. The tree suggests that these DOF proteins can be divided into seven groups (Groups I-VII) (Fig. 2). Group III did not contain any of PgDOF1-12, but each of the other groups contained at least one of them. The alignment of protein sequences showed that PgDOFs have the conserved DOF domain in their N-terminal regions and that their C-terminal regions are variable (Fig. S1).

Fig. 2

Neighbor-joining phylogenetic tree of DOF proteins from pearl millet (PgDOF), foxtail millet (SiDOF), rice (OsDOF), sorghum (SbDOF), maize (ZmDOF), and Arabidopsis (AtDOF). PgDOFs are indicated by arrows. The databases from which the sequences were downloaded are listed in Table S4

Exon-intron Structure and Conserved Motif Analysis

PgDOF3 and PgDOF10 contained only one exon, PgDOF4 contained six exons, and the other PgDOFs contained two or three exons (Fig. 3a).

Fig. 3

(a) Exon-intron structures of PgDOFs. Exons and introns are indicated by red rectangles and black horizontal lines, respectively. (b) Motifs identified in PgDOFs. Motifs indicated by boxes were detected by MEME and the number in boxes (1 to 15) represents Motif 1 to 15, respectively. Box sizes indicate the length of the motifs, and the consensus sequences of these motifs are presented in Table S2

Fifteen conserved motifs (Motif 1–15) were identified de novo in PgDOF1-12 by the MEME program (Table S2). Motif 1 (Fig. S2a) and Motif 3 (Fig. S2b) were found in N-terminal regions of most of the PgDOFs. PgDOF12 contained only one (Motif 1) of those motifs, whereas PgDOF9 contained 10 of those motifs (Fig. 3b). The differences in motif distribution between PgDOF1-12 may be relevant to their functional divergence. ELMs, which are motifs known to be involved in regulating protein functions such as protein modification, protein-protein interactions and subcellular localization (Kumar et al. 2022), were also identified in PgDOF1-12 (Dataset S2).

Analysis of cis-acting Elements in Promoters of PgDOFs

The cis-acting elements identified in the 2000-bp promoters of PgDOFs included light-responsive elements (the TCT-motif and G-box), plant hormone-responsive elements (including abscisic acid-responsive elements, methyl jasmonate-responsive elements, gibberellin-responsive elements, and salicylic acid-responsive elements), stress-responsive elements (including anaerobic-induction elements, low temperature-responsive elements, MYB-binding elements, and the W-box (WRKY-binding element)), and plant growth-associated elements (including the CAT-box for meristem-specific gene expression, the RY-element for seed-specific expression, and elements related to the circadian rhythm) (Fig. 4 and Table S3).

Fig. 4

Cis-acting elements in 2000-bp promoters of PgDOFs. The information on these cis-elements is given in Table S3

Expression Pattern of PgDOFs Under Various Abiotic Stresses

The expression patterns of PgDOF1-12 were analyzed by qRT-PCR. All PgDOFs were upregulated by PEG (polyethylene glycol)-induced dehydration stress in leaves. PgDOF4 expression was ~100 times stronger 6 and 12 h after the stress treatment was initiated than at 0 h. PgDOF8 expression was ~400 times stronger 12 and 24 h after the treatment was initiated. In contrast, PgDOF4, PgDOF5 and PgDOF8 were suppressed by the same treatment in roots (Fig. 5a). Under a salinity (NaCl)-stressed condition, all the PgDOFs except PgDOF8 were upregulated in leaves. Among them, PgDOF5 was the most strongly upregulated (with a 400-fold increase 12 h after the stress was imposed). However, only seven of the 12 PgDOFs were upregulated by the salinity stress treatment in roots, and PgDOF5 was even suppressed in roots (Fig. 5b). A high temperature (42 °C) upregulated all the PgDOFs in both leaves and roots (Fig. 5c). A low temperature (4 °C) upregulated all the PgDOFs except PgDOF2 and PgDOF12 in leaves, and upregulated all of them in roots (Fig. 5d). Thus, most of the PgDOFs were upregulated by the stress treatments in leaves and/or roots. This result is consistent with the finding that PgDOFs have stress-responsive cis-elements in their promoters (Fig. 4 and Table S3).

Fig. 5

(a) The expression levels of PgDOFs in leaves and roots in the presence of dehydration (15% (w/v) PEG) stress. (b) The expression levels of PgDOFs in leaves and roots in the presence of salinity (250 mM NaCl) stress. (c) The expression levels of PgDOFs in leaves and roots in the presence of heat (42 °C) stress. (d) The expression levels of PgDOFs in leaves and roots in the presence of cold (4 °C) stress. Four-week-old plants were treated with PEG, NaCl, 42 °C or 4 °C for 6 h, 12 h, and 24 h. The expression levels were calculated by the comparative cycle threshold method. The Actin (ACT) gene was used as an internal control. Data are means ± SD.

Discussion

The DOF gene family contains approximately 30 members in Arabidopsis, rice, maize and other species. However, this study identified only 12 putative DOF genes, which could be classified into six of the seven groups, in the current version of the pearl millet genome sequence. Some DOF genes in pearl millet may have been lost in the course of evolution, or the current pearl millet genome sequence may be incomplete or contain errors.

Most PgDOFs contain zero to two introns (Fig. 3a). This number is consistent with the number of introns in DOF genes of other plant species. PgDOF1-10 were present in syntenic blocks with their homologs in rice, maize and foxtail millet (Dataset S1). These findings support the idea that these DOF genes are evolutionarily conserved and that their functions may also be conserved. For example, ZmDOF1, a maize DOF gene, regulates carbohydrate metabolism (Yanagisawa 2004). TaDOF1, a wheat DOF gene, also regulates carbon metabolism (Chen et al. 2005). AtDOF4.7, an Arabidopsis DOF gene, regulates floral organ abscission (Wei et al. 2010). PpDOF1, a Physcomitrella patens DOF gene, regulates nutrient-dependent filament growth (Sugiyama et al. 2012). BnCDF1, a Brassica napus DOF gene, regulates flowering time and freezing tolerance (Xu and Dai 2016). OsDOF3, a rice DOF gene, regulates the biosynthesis of gibberellins (Li et al. 2009). OsDOF12, OsDOF23, OsDOF24 and OsDOF25, four rice DOF genes, regulate flowering time, gene expression in seeds, and carbon and nitrogen metabolism (Yamamoto et al. 2006). Sorghum DOF (SbDOF) genes are responsive to light and hormones, and regulate endosperm-specific gene expression (Kushwaha et al. 2011). PgDOFs may have similar functions. Among the 15 conserved motifs identified de novo, Motif 10 was present only in PgDOF4 and PgDOF5 (Fig. 3b). It would be interesting to determine whether Motif 10 has a specific function.

PgDOF2, PgDOF6, PgDOF8, PgDOF9, and PgDOF10 were upregulated by the stress treatments (Fig. 5). These are all homologs of cycling DOF factor (CDF) genes of Arabidopsis and rice (Dataset S3 and S4), although PgDOF8 is distant from PgDOF2, PgDOF6, PgDOF9 and PgDOF10 in the phylogenetic tree (Fig. 2). In a previous study, overexpression of the tomato (Solanum lycopersicum) CDF genes SlCDF1 and SlCDF3 in Arabidopsis activated the stress-responsive genes COR15, RD29A and ERD10, and enhanced tolerance to drought and salinity stresses (Corrales et al. 2014). In another study, a transfer DNA (T-DNA) insertion mutation in an Arabidopsis CDF gene, CDF3, decreased plant tolerance to drought and cold stresses, whereas CDF3 overexpression activated COR15, RD29A and ERD10 as well as their upstream transcription factor genes, CBF1-3, DREB2A, ZAT10, and ZAT12, and enhanced tolerance to drought and cold stresses. These results suggest that CDF3 functions as a positive regulator of drought and cold stress tolerance upstream of those transcription factors (Corrales et al. 2017). In a previous transcriptome analysis, among PgDOF1-12, PgDOF6 and PgDOF12, the latter of which has no close homologs in either Arabidopsis or rice, were found to be expressed more strongly in roots of the drought-tolerant line ICMB 843 than in roots of the drought-sensitive line ICMB 863 under a drought-stressed condition (Dudhate et al. 2018; Fig. S3). Thus, PgDOF6 and the other putative CDF genes of pearl millet (i.e., PgDOF2, PgDOF8, PgDOF9, and PgDOF10) are candidates for DOF genes that regulate stress tolerance in pearl millet. Arabidopsis CDFs repress the expression of a circadian clock-related gene, CONSTANS, to inhibit photoperiod-dependent flowering (Imaizumi et al. 2005; Fornara et al. 2009), and both the transcript and protein levels of the Arabidopsis CDFs are regulated by components of the circadian clock (Imaizumi et al. 2005; Sawa et al. 2007; Fornara et al. 2009). It will also be interesting to determine whether PgDOF2, PgDOF6, PgDOF8, PgDOF9, and PgDOF10 are involved in the circadian rhythm and flowering. The circadian rhythm-related cis-elements in the promoters of PgDOF2 and PgDOF9 (Fig. 4 and Table S3) may be relevant to their expression patterns over the course of a day.