Genome-wide Identification and Abiotic Stress Response Pattern Analysis of NF-Y Gene Family in Peanut (Arachis Hypogaea L.)

The nuclear factor Y (NF-Y) transcription factor (TF) family consists of three subfamilies NF-YA, NF-YB and NF-YC. Many studies have proven that NF-Y complex plays multiple essential roles in stress response in Arabidopsis and other plant species. However, little attention has been given to these genes in peanut. In this study, thirty-three AhNF-Y genes were identified in cultivated peanut and they were distributed on 16 chromosomes. A phylogenetic analysis of the NF-Y amino acid sequences indicated that the peanut NF-Y proteins were clustered in pairs at the end of the branches and showed high conservation with previous reported plant NF-Ys. Evolutionary history analysis showed that only segmental duplication contributed to expansion of this gene family. Analysis of the 1500-bp regulatory regions upstream the start codon showed that, except for AhNF-YB6, peanut NF-Ys contained at least one abiotic stress response element in their regulatory region. Expression patterns of peanut NF-Ys in 22 tissues and developmental stages were analyzed. A few NF-Ys showed universal expression patterns, while most NF-Ys showed specific expression patterns. Through RNA-seq and qRT-PCR analyses, expression of six AhNF-Y genes was induced under salt stress in leaves or roots. In addition, AhNF-YA4/8/11, NF-YB4 and NF-YC2/8 also responded to osmotic stress, ABA (abscisic acid) and salicylic acid (SA) treatment.


Introduction
Nuclear factor Y (NF-Y) transcription factor, also known as heme activator protein (HAP) or CCAAT-binding factor (CBF), is present in almost all higher eukaryotic genomes. Eukaryotic NF-Ys consist of three subfamilies (NF-YA, NF-YB and NF-YC), and each NF-Y subunit is encoded by a single gene with multiple splicing isoforms in animals and yeast (Li et al. 1992). In contrast, each subunit of NF-Y is encoded by a family of approximately 10 genes in plants (Petroni et al. 2012). Eukaryotic NF-Ys function as a heterotrimeric complex to regulate the expression of downstream target genes (Nardini et al. 2013). NF-YB and NF-YC first dimerize in cytoplasm to create a molecular scaffold and then move into the nucleus to recruit an NF-YA component and subsequently bind to the CCAAT-box specifically (Myers et al. 2018). The NF-YB-YC-YA trimer can further recruit other TFs, such as CONSTANS (CO) or bZIP28 (Cao et al. 2014;Liu et al. 2010). In addition, other TFs, such as bZIP67, CO2 and VRN2, can also bind to the NF-YB-YC heterodimer competitively with NF-YA subunits (Li et al. 2011;Yamamoto et al. 2009). The function of NF-Ys in Arabidopsis thaliana, wheat, rice, maize, legume plants and tomato have been reported to regulate a variety of growth and developmental processes including male gametogenesis, embryogenesis, seed development and germination (Huang et al. 2015;Mu et al. 2013), photosynthesis and photomorphogenesis Stephenson et al. 2011), root growth (Ballif et al. 2011;Sorin et al. 2014;Zanetti et al. 2017), nodule formation (Bu et al. 2020;Hossain et al. 2016;Ripodas et al. 2019), flowering and yield regulation (Hwang et al. 2019;Muday et al. 2016;Shen et al. 2020;Su et al. 2018;Zhang et al. 2019), and fruit maturation .
Environmental stresses such as drought, salinity, and heat are the main challenges affecting the development, growth, and productivity of field crops, resulting in crop yield losses. A significant number of studies have revealed that NF-Ys are crucial regulators of plant stress response. AtNF-YA5, AtNF-YB1, GmNF-YA3 and PdNF-YB7 positively modulate drought tolerance of transgenic Arabidopsis (Chen et al. 2007;Han et al. 2013;Li et al. 2008;Ni et al. 2013), and OsNF-YA7 and ZmNF-YB2 confer drought tolerance to transgenic rice (Lee et al. 2015;Nelson et al. 2007). Overexpression of AtNF-YA1 and TaNF-YA10-1 significantly increased sensitivity to salt stress (Li et al. 2013;Ma et al. 2015a), and in constrast, SiNF-YA5 and GhNF-YA10/23 enhance the salt tolerance of plants . AtNF-YA2, AtNF-YC10, SlNF-YA9/10 were reported to participate in heat tolerance regulation in Arabidopsis and tomato (Rao et al. 2021;Sato et al. 2014). Moreover, some NF-Y subunits are involved in multiple stress responses. AtNF-YB2 and AtNF-YB3 positively regulate the heat stress response in Arabidopsis, but play negative roles in drought tolerance and exhibit functional redundancy in both processes (Sato et al. 2019). Rice OsHAP2E confers salinity and drought tolerance and increases photosynthesis and tiller number (Alam et al. 2015). Although some aspects of plant NF-Y research have advanced sufficiently to provide a mechanistic understanding, the regulatory mechanisms of most plant NF-Ys in stress response remain less well understood.
Peanut (Arachis hypogaea L.) is an important oil and food crop worldwide . More than half of the global peanut production comes from semiarid areas, where drought and soil salinity are the main limitations for peanut growth (Banavath et al. 2018). Cultivated peanut, which evolved from the hybridization and subsequent chromosome doubling of Arachis duranensis (A) and Arachis ipaensis (B), is an allotetraploid (AABB genome, 2n = 4x = 40) with a total genome size of approximately 2.7 Gb (Bertioli et al. 2016;Grabiele et al. 2012;Robledo et al. 2009). There is much less knowledge about the NF-Y TF involved in peanut stress response than the large genome. In the present study genome-wide identification and systematic analysis of the NF-Y gene family in cultivated peanut were performed. We identified the AhNF-Y gene family and analyzed their sequence features, phylogenetic relationships, chromosomal locations, gene duplication during the expansion. The expression profiles in 22 tissues and developmental stages and under salt stress were also investigated through reanalysis of published RNA-seq data. In addition, using quantitative real-time PCR (qRT-PCR), we analyzed the expression profiles of AhNF-Y genes under salt stress and identified several candidate genes responsive to abiotic stress and hormone treatment. The present results will facilitate future investigations on the functional characterization of NF-Y genes in peanut.

Identification of NF-Y Coding Genes in Peanut and Sequence Analysis
The genomic sequence of A. hypogaea cv. Tifrunner (Bertioli et al. 2019) and the annotated gene models were downloaded from PeanutBase (Version 1, https:// www. peanu tbase. org/ peanut_ genome). The NF-Y protein sequences of A. thaliana (Siefers et al. 2009) and Glycine max (Quach et al. 2015) used in this research (Additional file 4) were downloaded from the National Centre for Biotechnology Information (NCBI) database (http:// www. ncbi. nlm. nih. gov/). The hidden Markov model (HMM) profile of CBFB_NFYA (PF02045) was obtained from the Pfam protein family database (http:// pfam. xfam. org). The HMMER website server and BLASTP were used to search for NF-YA genes (p < 0.01) (Finn et al. 2015). The amino acid sequences of Arabidopsis, soybean and rice NF-Ys were used for BLAST of peanut NF-Y candidates. The putative AhNF-Y proteins were uploaded to the online Conserved Domain Database (https:// www. ncbi. nlm. nih. gov/ Struc ture/ bwrpsb/ bwrpsb. cgi) (Marchler-Bauer et al. 2017), Pfam and SMART (http:// smart. embl-heide lberg. de/) tools to verify the conserved NF-Y domain. DNAMAN software (LynnonBiosoft USA, version 6) was employed to perform multiple sequence alignments to determine whether the candidate genes could encode proteins carrying complete NF-Y subunit binding and DNA-binding domains. The ExPASy proteomics server (http:// prosi te. expasy. prg/) was used to acquire the sequence lengths, molecular weights and isoelectric points (Letunic et al. 2018;Robert et al. 2014). MEGA7 (Kumar et al. 2016) was used to perform sequence alignment and maximum-likelihood phylogenetic tree construction using the bootstrap method (number of bootstrap replications = 1000). iTOL (http:// itol. embl. de/) was used for visualization and further editing of the phylogenetic tree. The MEME program (http:// meme-suite. org/ tools/ meme) (Bailey et al. 2009) was employed to detect the conserved motifs in the full length of identified AhNF-Y proteins with the following parameters: the maximum number of motifs was 20 and the length range was 6-100 amino acids. Exon-intron structure visualization was performed by comparing cDNA sequences with their corresponding full-length genomic DNA sequences using the online tool Gene Structure Display Server version 2.0 (gsds. cbi. pku. edu. cn/) .

Chromosomal Distribution and Gene Duplication Analysis
The Multiple Collinearity Scan toolkit (MCScanX) (Wang et al. 2012) was employed to detect gene duplication events with the default parameters. The visualization of chromosomal distribution and synteny analysis were performed by Circos and Mapchart (Krzywinski et al. 2009;Voorrips 2002). The amino acid sequences of duplicated gene pairs were first aligned and used to guide the alignment of cDNA sequences with in-house Perl-scripts. KaKs calculator was used to compute the nonsynonymous (Ka) and synonymous (Ks) substitution values of each duplicated gene pair using the YN model. The divergence time (million years ago, Mya) was calculated with the formula T = Ks/2r. The r (rate of divergence for nuclear genes) was taken to be 1.5 × 10 -8 synonymous substitutions per site per year for dicotyledonous plants (Koch et al. 2000).

Analysis of Regulatory cis-elements Upstream of AhNF-Y Genes
The 1.5-kb upstream sequences of the initiation codon (ATG) of each AhNF-Y gene were submitted to the PlantCARE database (http:// bioin forma tics. psb. ugent. be/ webto ols/ plant care/ html/) to identify six regulatory elements (Lescot et al. 2002).

Plant Materials and Growth Conditions
The plant material used in this study, A. hypogea L. cultivar Fenghua 2 (Spanish type), was bred and preserved by our group. Mature seeds were germinated on distilled water-wet degreasing cotton extended in seedling cultivation disks. These disks were placed at 26 °C in darkness for 3 days and then exposed to long-day conditions (LD; 16 h light and 8 h dark cycle). Two-functional-leaf seedlings were transplanted to a hydroponic box and cultured with 1/5 Hoagland's nutrient solution (Pan et al. 2017).

Stress Treatment, Total RNA Extraction and Reverse Transcription
To analyze the expression pattern of AhNF-Y genes, twoweek-old seedlings were treated with nutrient solution containing 200 mM NaCl, 20% (w/v) mannitol, 100 mM abscisic acid (ABA) and 100 mM salicylic acid (SA), respectively. The leaves and roots of seedlings treated with NaCl were harvested at 0 and 16 h. Leaves of seedlings treated with mannitol were collected at 0, 2, 4, 6, 8, 12 and 24 h. For ABA and SA treatment, the leaves of seedlings were harvested at 0, 1, 2, 4, 6 and 8 h. All samples were frozen immediately in liquid nitrogen for RNA extraction. For each sample, leaves or roots from ten seedlings were harvested. Three independent biological replicates were performed for each treatment.
Total RNA was isolated with Quick RNA Isolation Kit (Waryong, Beijing, China) following the manufacturer's instructions. The concentration of the total RNA in each sample was quantified by a NanoDrop 2000 microvolume spectrophotometer (Thermo Fisher Scientific, Massachusetts, USA). Reverse transcription was performed using Advantage RT-for-PCR Kit (TaKaRa, Dalian, China) according to the manufacturer's instructions. Total RNA and cDNA samples were stored at -80°Cand -20 °C, respectively.

Expression Profile Analysis of AhNF-Ys
The expression atlas of 22 A. hypogaea tissues was downloaded from PeanutBase (https:// www. peanu tbase. org/ gene_ expre ssion/ atlas) Dash et al. 2016). In these RNA-Seq data, the normalized reads were mapped to an in silico amphidiploid genome assembled from the genome of the diploid ancestor A. duranensis and A. ipaensis . BLAST was performed to identify homologous genes of AhNF-Y in the A. duranensis and A. ipaensis genomes. For each tetraploid peanut AhNF-Y, only when the A. duranensis or A. ipaensis NF-Y gene showed the highest similarity in amino acid sequence was it defined as the homologous gene. IDs of the homologous gene were used to extract the fragments per kilobase of transcript per million mapped reads (FPKM) values from the tissue expression atlas. Transcriptome data under salt stress were archived from the public repository of the NCBI (https:// www. ncbi. nlm. nih. gov) under BioProject accession number PRJNA560660 (unpublished data from part of another work of our group, BioSample SAMN12594512-14, SAMN12594518-20, SAMN12594524-26, SAMN12594530-32). The heatmap was created using TBtools (Chen et al. 2020) with log2transformed FPKM values, and row clustered.
SYBR Green real-time PCR was carried out using TB Green Premix Ex Taq (Tli RNaseH Plus, TaKaRa, Dalian, China) on a StepOne Plus system (Applied Biosystems, Waltham, USA) in a 20μL reaction volume according to the manufacturer's instructions. Three technical replicates were performed for each sample. The primers were designed using Beacon Designer 7.9. Actin was used as the internal reference gene. Sequences of the primers and actin are shown Table S1. The relative expression levels of the AhNF-Ys were evaluated by the 2 -ΔΔCt method. Statistical

Protein-protein Interaction Network Analysis
All differentially expressed genes (DEGs) encoding DNAbinding proteins detected under salt stress were screened from the online transcriptome data mentioned in "Expression profile analysis of AhNF-Ys". The interaction networks were created by the online software STRING (Version 11.0, https:// string-db. org/) using the amino acid sequences of DEGs and visualized by Cytoscape 3.7.1. The interaction database of A. thaliana and G. max were selected as references.

Identification and Phylogenetic Analysis of AhNF-Y Genes
In the allotetraploid peanut cultivar, 42 AhNF-Y candidates (19 AhNF-YAs, 14 AhNF-YBs and 9 AhNF-YCs) were identified. 33 of them (14 NF-YAs, 10 NF-YBs and 9 NF-YCs), which could encode NF-Y proteins with complete NF-Y subunit binding and DNA-binding domains, were defined as AhNF-Ys (Table 1, Fig. S1). All these genes were named according to chromosome name and chromosomal location (Fig. S2). Basic information on the AhNF-Y family members is listed in Table 1, including the IDs of the AhNF-Y-coding genes in A. hypogaea cv. Tifrunner genome (version 1), gene loci on the chromosome, length of the open reading frame (ORF), molecular weight (MW) and isoelectric point (pI). AhNF-YA12 was the largest protein with 492 amino acids and an MW of 55.45 kDa, while the smallest protein was AhNF-YB2 with 171 amino acids and an MW of 18.88 kDa. The pI ranged from 5.19 (AhNF-YC9) to 9.64 (AhNF-YA5).
To investigate the phylogenetic relationships of NF-Y genes in plants, phylogenetic tree was generated from 33 peanut NF-Ys (Table S2), 33 A. thaliana NF-Ys (AtNF-Ys) and 58 G. max NF-Ys (GmNF-Ys, Table S3). All the NF-Y proteins were clustered into three main clades, and each clade was equivalent to a single subfamily. AhNF-Y proteins tend to be present in pairs at the ends of branches, except AhNF-YC1, AhNF-YC2, and AhNF-YC8 (Fig. 1). Most of the pairs of peanut NF-Y genes were located at similar positions on the corresponding chromosome of the two subgenomes ( Fig. S4). However, there were a few exceptions in the NF-YC subfamily, including AhNF-YC2, AhNF-YC4, AhNF-YC5, AhNF-YC6, and AhNF-YC8. Furthermore, as shown in Fig. 1, each subfamily consisted of a few secondary clades (from clade I to clade XII). NF-Ys from all three species were clustered in clades II, IV, VI, VII, IX, and XII.
At the branch ends of clade I and XI, there were only NF-Ys from legume (soybean and peanut), while NF-Y proteins from peanut were not clustered in clades III, V, VIII, and X (gray strips in Fig. 1).
Some studies indicated that both segmental duplication and tandem duplication played important roles in the generation of gene families during evolution, as well as crop domestication (Cannon et al. 2004;Kondrashov 2012;Salman-Minkov et al. 2016). We analyzed the gene duplication events of AhNF-Y genes in the A. hypogaea genome (Fig. 2). Duplication events occur only within subgenomes. Two duplicated pairs were detected in the A and B subgenomes, respectively ( Table 2). The pairwise synonymous distances (Ks values) within the collinear blocks and divergence time of the segmental duplicated gene pairs were calculated ( Table 2). The divergence of NF-YB6 and NF-YB7 occurred much earlier than that of the other pairs, and all the duplication events of peanut NF-Y occurred before speciation of the two wild diploids at approximately 2 Mya ).

Gene Structure and Conserved Motif of AhNF-Ys
Phylogenetic analysis was performed for peanut NF-Y proteins, and most of the AhNF-Ys were clustered in pairs, except AhNF-YC4, AhNF-YC5 and AhNF-YC6 (Fig. 3A, Table S4). All the paired genes shared similar locations on the homologous chromosomes of the two subgenomes (Fig. S2). Furthermore, the gene structures of the paired genes showed a high degree of similarity, including the numbers and positions of both introns and exons (Fig. 3B). For example, there was an intron in the 5' untranslated region (UTR) of both AhNF-YA7 and AhNF-YA14. The members of the AhNF-YA subfamily contain a larger number of introns (from 4 to 7), while there are only one or two introns in the AhNF-YC subfamily, except in AhNF-YC1 (Fig. 3B).
The MEME analysis tool was used to predict the conserved motifs in AhNF-Y genes ( Fig. 3C and Table S5). A total of 20 motifs were identified. Motifs 7 and 8 were identified only in the AhNF-YA subfamily, and they consist the NF-YB/C interaction and DNA-binding domains of the NF-YA subunit (Fig. S1A) together with Motif 1. Motifs 2, 4 and 11 correspond to the NF-YA/NF-YC subunit interaction and DNA-binding domains of NF-YB protein ( Fig. S1B), which were unique to the AhNF-YB subfamily. Motifs 3, 6 and 10 existed only in the AhNF-YC subfamily and corresponded to the NF-YA/NF-YB subunit interaction and DNA-binding domains in the NF-YC subunit  (Fig. S1C). These results indicated the high conversion of the protein-protein and protein-DNA interaction structures of the NF-Y family in peanut, which is consistent with previous reports (Myers et al. 2018). In addition, some motifs were identified only in a few proteins; for example, motif 15 was observed only in NF-YA4/11 and NF-YA5/12, and motif 16 was identified only in NF-YB3/8 (Fig. 3C). The specificity of these motifs may result in functional differences between NF-Y subunits within each subfamily.

Tissue/organ-specific Expression Analysis of AhNF-Ys
To determine the expression patterns of AhNF-Y genes, we used published RNA-seq data, which covered 22 tissues throughout the life cycle of peanut ).
Genes with the most similar expression patterns were clustered together by TBtools, and all the AhNF-Ys were classified into four main groups (Fig. 4, Table S6). Group 1 (G1) contained 3 genes (AhNF-YA3, A10 and C6), which showed low expression levels in most tissues except seed. Group 2 (G2) comprised 8 genes. The expression of these genes was hardly detected in most tissues/organs except a few, such as AhNA-YB1/6 in seeds. The third group (G3) included only 2 genes (AhNF-YC1 and C8), with extremely high expression levels throughout the life cycle and nearly all organs/tissues. Group 4 (G4) was composed of the other AhNF-Ys. These genes showed higher expression levels than genes belonging to group 1 and group 2 and displayed diverse expression patterns. For example, AhNF-YB4, B9 and C3 were expressed in all the examined tissues and developmental stages, but the expression levels are quite different; AhNF-YB2 and B7 were highly expressed in nodules only. The above results indicated that AhNF-Y genes showed diverse expression patterns in peanut.
In contrast, 11 other AhNF-Ys (AhNF-YA2, A5, A12, B3, B5, B8, B10, C3, C4, C6 and C9) could not be induced by salt stress. In addition, the expression of AhNF-YA10 and AhNF-YB1 was not detected. This distinction in the expression patterns indicated that members of the AhNF-Y family might have different responses and regulatory mechanisms under salt stress.
To further validate the RNA-seq data, the expression of 6 typical AhNF-Y genes (AhNF-YA4, A8, A11, B4, C2 and C8) in both leaves and roots was analyzed by qRT-PCR ( Fig. 5B and C). After 16 h of salt treatment (200 mM NaCl), the expression levels of AhNF-YA4/A8/C8 were upregulated in both leaves and roots, while AhNF-YC2 was induced only in leaves. These results are consistent with the RNA-seq data. In addition, the expression pattern of AhNF-YA11 in leaves was not affected by salt stress, and the expression levels of AhNF-YB4 in leaves and roots showed no significant change. The expression patterns of AhNF-YA11 and AhNF-YB4 were inconsistent with the RNA-seq data. The expression of the other 6 genes under salt stress was also detected by qRT-PCR (Fig. S4A). Correlation analysis between the RNA-seq and the qRT-PCR result of the 12 genes were performed, and the Pearson's coefficient was 0.8375 (Fig. S4B).
It has been reported that plant NF-Y proteins perform their functions by recruiting other TFs to specific promoter sequences. For example, Arabidopsis NF-YC3 could interact with ABF3/4 to promote flowering by inducing the transcription of SOC1 under drought stress (Hwang et al. ). In addition, the CO protein could even interact with NF-YB2-YC9 dimer instead of the NF-YA subunit to binding the CORE element of the FLOWERING LOCUS T promoter (Gnesutta et al. 2017). To further obtain an overview of the transcriptional regulation function of AhNF-Ys in peanut under salt stress, we established a protein-protein interaction (PPI) network among the salt stress-induced DNA-binding protein encoding genes in peanut (Table S8). As shown in Fig. 5D, peanut NF-Ys could interact with NF-Y proteins belonging to other two subfamilies (red node). In addition to the AhNF-YA and AhNF-YC subunits, the predicted AhNF-YB1 could interact with other kinds of TFs directly, including a bZIP TF (arahy.63C1KK), AP2 TFs (arahy.4K8YXT, arahy.F0M2KT), a MADS-box TF (arahy.XIQ273), and plant hormone-responsive TFs (arahy.HP6LSM, arahy. HB4W2S). The above TFs further interact with TFs involve in plant hormone response, stress response, flowering regulation, circadian clock and so on (Fig. 5D, Table S9).

Abiotic Stress-related Response of AhNF-Ys
To investigate the potential regulatory mechanisms of AhNF-Ys in the abiotic stress response, the sequences 1.5 kb upstream from the initiation codon of the AhNF-Y genes were detected using the PlantCARE database to identify regulatory elements. Six stress-related regulatory elements, including ABA response element (ABRE), CGTCA-motif, MYB binding site (MBS), TCA-element, TC-rich repeat and TGACG-motif, are shown in Fig. S3. ABRE elements were detected in promoter regions of 18 AhNF-Ys. The methyl jasmonate (MeJA)-responsive elements CGTCA-motif and TGACG-motif were both detected in 22 AhNF-Ys. A total of 12 AhNF-Ys contained TCA-elements (SA responsiveness). In addition, MBS (drought responsiveness) and TCrich repeats (defence and stress responsiveness) were found in 7 and 10 AhNF-Ys, respectively. At least one regulatory element was identified in the promoter region of the NF-Y genes, except in that of AhNF-YB6. These results suggested that AhNF-Y genes may be involved in many different abiotic stresses.
To further investigate whether the expression of these predicted AhNF-Y genes was influenced by other stress treatments (mannitol), ABA and SA, qRT-PCR was used to survey the transcript levels in leaves (Fig. 6). The results revealed that the transcript levels of AhNF-YA4 and AhNF-YA8 were downregulated, and both reached the lowest levels under osmotic stress at approximately 8 h. In contrast, AhNF-YA11, AhNF-YC2 and AhNF-YC8 had similar expression profiles and showed a trend toward upregulation. Under the same treatment, the expression level of AhNF-YB4 was upregulated until approximately 12 h. The transcript levels of all 6 predicted AhNF-Y genes were increased under ABA treatment and peaked at approximately 2 h, except for those of AhNF-YB4. In addition, all these genes responded significantly to SA treatment. All the above results indicated that these 6 AhNF-Y genes responded to osmotic stress, ABA and SA with distinct expression patterns.

Discussion
The NF-Y TF family is widely distributed in eukaryotes and plays important roles in development and stress responses. Studies have shown that the protein-protein interaction domains of NF-Y proteins are highly similar within each subfamily (Laloum et al. 2013). Therefore, the NF-Y protein could theoretically interact with members from other subfamilies, and these interactions have been proven using yeast two-hybrid assay (Hackenberg et al. 2012). For example, there are at least 10 genes encoding each subunit type in Arabidopsis; therefore, up to 1690 theoretical NF-Y complexes can be formed (Siefers et al. 2009). This amplification creates a flexible formation, leading to new and divergent functions (Myers et al. 2018;Petroni et al. 2012). The complex interactions within NF-Y family, as well as between NF-Ys and other TFs, make functional analysis difficult. In addition, the overlapping functionality between NF-Y subunits resulted in inefficiency of the forward genetic mutant screen. Therefore, the function of more than one-third of NF-Y subunits has not been reported, and only a few NF-Y complexes have been described (Zhao et al. 2016). Before further study of the biological function of the NF-Y protein in peanut, it is necessary to obtain an overview of the NF-Y Fig. 6 Expression analysis of 6 AhNF-Y genes in response to mannitol A, ABA B and SA C in peanut leaves. Bars reflect the means ± SDs of three replicates. Asterisks indicate the corresponding gene significantly up-or downregulated compared with the untreated control (**P < 0.01 and *P < 0.05, Student's t-test) family, including the gene structures, homology between members, and expression patterns in tissues and under stress.
In this study, we identified 33 NF-Y-coding genes throughout the peanut genome. The number of AhNF-Ys is similar to that in diploid plants such as Arabidopsis and rice (Siefers et al. 2009;Thirumurugan et al. 2008). Peanut is an allotetraploid crop with a genome size of approximately 2.7 Gb. Compared with crops, such as soybean (Shen et al. 2018), maize ) and cotton , peanut has fewer NF-Ys Quach et al. 2015;Zhang et al. 2020Zhang et al. , 2016. Similar situations have been reported in other gene families of peanut as well. For example, although cotton (Gossypium hirsutum, more than 60 K protein coding genes) and peanut are both tetraploid crops, there are 306 NAC TFs in cotton and only 162 in allotetraploid peanut (Mohanta et al. 2020;Wang et al. 2019). Furthermore, the numbers of Hsf genes in soybean and cotton were 38 and 40 (Li et al. 2014;Wang et al. 2014), which are further more than the total number of Hsf genes in the diploid ancestors of peanut . Gene duplication events are the main reason of gene family expansion. The genome of both diploid ancestors of the peanut cultivar underwent the early papilionoid whole-genome duplication (WGD) approximately 58 Mya (Bertioli et al. 2016), and WGD events occurred after tetraploidization have not been reported yet. Additionally, divergence time of the four duplication pairs identified in peanut NF-Y family were 36.731, 31.721, 37.689 and 67.689 Mya respectively (Table 2), which were long before speciation and hybridization of the two diploids ancestors reported Zhuang et al. 2019). Therefore, number of NF-Y coding genes in allotetraploid peanut may mainly determine by those of the two diploid ancestors. BLASTP analysis of the NF-Y protein coding genes showed that, the total number of NF-Y in the two diploid ancestors is similar with that of allotetraploid peanut (Table S10), which is consistent with the above prediction. It is reported that, a new allele of the duplicated genes has a small probability to be fixed in a diploid population compared with pre-existing alleles (Cagliari et al. 2011). According to the phylogenetic analysis (Fig. 1), AhNF-Ys were missing on several branches and only GmNF-Y and AtNF-Y were presented. However, according to current knowledge about diploid and allotetraploid peanut genome, it is difficult to determine whether there were duplications not fixed by natural selection or massive gene lose in diploid ancestors. In conclusion, it can be inferred that the fewer number of peanut NF-Ys may due to the origin and evolution of NF-Ys in two diploid ancestors, and it will be an interesting area for further study.
There were relatively few NF-Y coding genes in peanut, and most of the NF-Ys existed as homologous gene pairs possibly indicating low functional redundancy among nonhomologs within each subfamily. Homologs from different subgenomes usually share similar gene structures and Fig. 7 Schematic diagram of the probable function of peanut AhNF-Ys. According to the tissue/organ expression patterns and the qRT-PCR data under stress, the probable biological processes peanut AhNF-Ys involved in were summarized. AhNF-YC1 and AhNF-YC8 showed relatively high expression level, but low differences between tissues/organs, therefore they are not included in this diagram

AhNF-YA4/8/11
AhNF-YB4 AhNF-YC2/8 MEME motifs (Fig. 3). Some MEME motifs are conserved at the subfamily level, and they usually correspond to the DNA-binding and NF-Y interaction domains of NF-Y proteins. The MEME motifs that did not exhibit subfamilylevel conservation will provide clues for NF-Y functional analysis. We performed row clustering for heatmap analysis, and the homologs showed high similarity in expression. There were 12 homologous pairs in the NF-YA and NF-YB subfamilies (Fig. 3A), seven of which (NF-YA4/11, NF-YA1/8, NF-YA7/14, NF-YA2/9, NF-YB5/10, NF-YB4/9 and NF-YB2/7) showed the same tissue and salt stress-induced expression pattern (Figs. 4 and 5A) and the same MEME motifs within pairs (Fig. 3A). These pairs may play similar roles in most conditions; for example, AhNF-YA4 and AhNF-YA11 represent similar expression patterns under ABA and SA treatment (Fig. 6B, C). It has been reported that even if two NF-Y genes share very high sequence similarity and functional redundancy, some functional divergence remains (Sato et al. 2019). Under osmotic stress, the expression pattern of AhNF-YA4 and AhNF-YA11 were different (Fig. 6A). Although some homologous pairs carried the same MEME motifs, they presented different expression patterns, such as NF-YA6/13, NF-YA3/10, NF-YB3/8 and NF-YB1/6. This difference may be due to the cis-elements in the promoter region (Fig. S3). AhNF-YA5 and AhNF-YA12 are homologs carrying different MEME motifs. There is a copy of motif 4 in NF-YA5, while motif 1 is present at a position similar to that in NF-YA12 (Fig. 3). These difference in promoter or function-unknown motifs may result in functional diversity of the NF-YA and NF-YB homologs. The NF-YC subfamily showed relatively low similarity in protein structure (Fig. 3) and expression in tissues (Fig. 4), which indicated that there may be low functional redundancy between NF-YCs and that NF-YC subunits may be the important determinants of NF-Y complex function in peanut.
According to the tissue/organ expression patterns and the qRT-PCR analysis under stress, the probable biological processes peanut AhNF-Ys involved in were summarized (Fig. 7). AhNF-Ys involve in both vegetative and reproductive, including shoot tip and root growth, flower and seed development. In addition, AhAF-Ys also play roles in abiotic stress response, disease resistance (SA-responsive genes) and nodule formation. Their functions in salt stress response were focused. Overexpression of NF-YA and NF-YB subunits could enhance the salt tolerance of Arabidopsis and Paspalum vaginatum O. Swartz (Li et al. 2013;Ma et al. 2015b;Wu et al. 2018). However, the regulatory mechanisms, especially the interaction proteins and downstream targets, have rarely been reported. In this study, we established a protein-protein interaction network including differentially expressed TFs under salt stress (Fig. 5D). In this network, AhNF-Ys could interact with multiple TFs and participate in several biological processes. These results will help in functional studies of plant NF-Ys under salt stress. NF-Ys in Medicago truncatula, Phaseolus vulgaris, Lotus japonicus and Parasponia andersonii play important roles in nodule formation, including PvNF-YA1, LjNF-YA1, MtNF-YA1, PanNF-YA1, PvNF-YB7 and PvNF-YC1 (Bu et al. 2020;Combier et al. 2006;Hossain et al. 2016;Laloum et al. 2014;Ripodas et al. 2019;Soyano et al. 2013;Zanetti et al. 2010). According to phylogenetic analysis (Fig. S5), their homologs genes in peanut, namely AhNF-YA1/7/14, AhNF-YB2/7, and AhNF-YC1/7 showed relatively high expression levels in nodules, particularly AhNF-YB2/7 (Fig. 4, Fig. 7). These proteins may participate in processes associated with peanut-rhizobia symbiosis in the form of NF-Y trimers or NF-Y-TF complexes.
In conclusion, here, we provide an overview of peanut NF-Y genes. The information described here will help in further investigation of the plant NF-Y gene family, especially in the context of salt stress response regulation and symbiosis with rhizobia.

Conflict of interest
The authors declare that they have no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.