Genome‐wide identification of mlo genes in the cultivated peanut (Arachis hypogaea L.)

Powdery mildew disease caused by Oidium arachidis poses a threat to peanut production in Africa. Loss-of-function mutants of specific Mlo (Mildew Locus O) genes have provided broad-spectrum and durable resistance against the pathogen in many crop species. Since there is great potential to utilize susceptibility gene-mediated resistance in crop improvement, genome-wide mining of susceptibility genes is required for further research. However, susceptibility genes have not been characterized in the peanut genome. In this study, the genome of the cultivated peanut was used as a reference to identify the AhMlo loci. A total of 25 AhMlo loci were identified and found to be distributed across the chromosomes of the cultivated peanut. Eleven AhMlo loci were located on the A genome and the remaining 14 on the B genome. A variable number of introns (4–14) and transmembrane helices (4–8) was observed among the AhMlo loci. Furthermore, phylogenetic analysis of the AhMlo loci along with homologs from other species clustered the AhMlo loci into six clades. Three AhMlo loci were clustered in clade V, which is known to group the powdery mildew (PM) susceptibility loci in dicots. Additionally, four core promoters were predicted in the promoter region of the specific AhMlo loci, along with cis-regulatory elements related to PM susceptibility. These results provide strong evidence for the identification and distribution of the Mlo loci in the cultivated peanut genome, and the identified specific AhMlo loci can be used for loss-of-susceptibility studies.


Introduction
Sources of disease resistance (R) genes are an essential prerequisite for developing resistant plant varieties. Much effort has been made to exploit, clone, and transfer R genes in conventional resistance breeding programs. R genes encode proteins that recognize pathogen effectors and activate effector-triggered immunity. R-gene-mediated disease resistance incorporated into elite cultivars provides an effective and eco-friendly approach for sustainable agriculture. Although R genes play a key role in effectively combating a diversity of pathogens and pests, they tend to have a narrow spectrum of pathogen recognition due to monoculture systems in modern agriculture (Langner et al. 2018). In addition, breeding resistant varieties via introgression of R genes is time consuming, especially when R genes are linked to deleterious genes, which requires more generations to break the linkage drag (Berg et al. 2015; Yin and Qiu 2019).
Complementary to R genes, susceptibility (S) genes are an alternative source of disease resistance. S genes encode proteins that facilitate pathogen growth and suppress immune responses, or they can act as negative regulators of immunity (Pavan et al. 2010; Langner et al. 2018). Loss-of-function mutations in an S gene lead to durable, broad-spectrum, recessively inherited resistance. Pathogenesis is halted because the impaired S gene fails to produce the substances needed for pathogen growth. Identifying S genes and inducing mutations in them is becoming a popular strategy for resistance breeding (Kusch and Panstruga 2017; Borrelli et al. 2018).
Powdery mildew (PM) is a widespread plant disease caused by obligate biotrophic fungi. As an airborne disease, it spreads easily and quickly, threatening crop plants and causing significant reductions in crop yield (Acevedo-Garcia et al. 2014; Murube et al. 2017). Mildew locus O (Mlo) genes function as negative regulators of vesicle-associated and actin-dependent defense pathways at the site of PM penetration (Pessina et al. 2016). Mlo genes encode proteins with seven transmembrane domains and a calmodulin-binding site, are conserved in monocots and dicots (Appiano et al. 2015), and have been phylogenetically divided into seven clades (Loieno et al. 2015). Monocot Mlo proteins associated with PM susceptibility fall in clade IV, whereas the corresponding dicot proteins are sorted into clade V (Polanco et al. 2018).
Powdery mildew disease caused by Oidium arachidis (Chorin) in peanut leads to yield losses of between 13 and 50% in Africa (Middleton et al. 1994). The sub-Saharan area is the largest peanut production region in Africa; Nigeria accounts for 30% of total African production, followed by Senegal at 8% and Ghana at 5% (OECD/FAO 2016). Fungicide application is the strategy used to control powdery mildew and reduce yield loss. However, it can have a negative impact on the environment and human health and increases farming costs. The rapid evolution of fungal strains that develop resistance to these fungicides is another concern. Enhancing host resistance is the most sustainable approach to controlling PM and would benefit farmers and the entire peanut industry in Africa. Information on sources of resistance to PM is limited in peanut. Therefore, the aim of this study was to localize all Mlo genes in the peanut genome, to understand their gene structures, and to identify the specific Mlo genes that confer powdery mildew susceptibility. Identifying these specific Mlo susceptibility genes would facilitate the induction of loss-of-function mutations through biotechnology, such as genome editing, which would prevent pathogen infection.

Materials and methods
Isolation and validation of the mlo genes in the cultivated peanut
Mlo DNA sequences were downloaded from the peanut genome assembly v1 for Arachis hypogaea var Tifrunner (http://www.peanutbase.org/gbrowse_peanut1.0). All sequences were used as queries for amino acid sequence searches using the BLASTx tool from the NCBI database (http://www.ncbi.nlm.nih.gov). TM helices of the putative AhMlo proteins were predicted using the TMHMM server 2.0 (http://cbs.dtu.dk/services/TMHMM).
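As a rough illustration of how the TM helix counts could be collected once TMHMM has been run, the sketch below reads lines in the server's one-line-per-protein short format and extracts the `PredHel` field. The function name and the example lines are hypothetical; the example lines are trimmed to the fields used here, the lengths are invented, and only the helix counts for AhMlo1 (8) and AhMlo9 (5) follow the values reported in this study.

```python
import re

def parse_tmhmm_short(text):
    """Return {sequence_id: predicted helix count} from TMHMM short-format lines."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        seq_id = line.split()[0]
        match = re.search(r"PredHel=(\d+)", line)
        if match:
            counts[seq_id] = int(match.group(1))
    return counts

# Illustrative input only: lengths are invented and "toy_seq" is a made-up record.
example = (
    "AhMlo1\tlen=600\tPredHel=8\n"
    "AhMlo9\tlen=400\tPredHel=5\n"
    "toy_seq\tlen=550\tPredHel=7\n"
)

helix_counts = parse_tmhmm_short(example)
# Proteins with the seven TM domains characteristic of the Mlo family
seven_tm = sorted(name for name, n in helix_counts.items() if n == 7)
print(helix_counts)
print(seven_tm)
```

Parsing by field name rather than by column position keeps the sketch tolerant of extra fields such as `ExpAA` or `Topology` in the real output.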
Phylogenetic and gene structure analysis
Peanut Mlo protein sequences were aligned with the protein sequences of Mlo homologs from several legume species along with those from Arabidopsis and barley using the ClustalW method in MEGA7 (Kumar et al. 2016). The aligned sequences were used to generate a phylogenetic tree using the neighbor-joining method with 1000 bootstrap replicates. All DNA sequences were also used for gene finding in FGENESH-M with soybean genome-specific parameters (http://linux1.softberry.com/berry.phtml?topic=fgenesh%26group=programs%26subgroup=gfind). Gene structure was visualized using the Gene Structure Display Server GSDS2.0 (http://gsds.cbi.pku.edu.cn) (Hu et al. 2015).
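The bootstrap step can be pictured with a minimal sketch: alignment columns are resampled with replacement, and a pairwise distance matrix is recomputed for each pseudo-replicate before a tree is built from it. This is not the MEGA7 implementation. The p-distance used here is an assumption (the substitution model is not stated above), tree construction itself is omitted, and the toy alignment and all function names are hypothetical.

```python
import random

def p_distance(seq_a, seq_b):
    """Proportion of differing residues over columns where neither sequence has a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(a != b for a, b in pairs) / len(pairs)

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap pseudo-replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}

def distance_matrix(alignment):
    """Pairwise p-distances keyed by (name_a, name_b) with name_a < name_b."""
    names = sorted(alignment)
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# Hypothetical gap-free toy alignment, not real Mlo sequences.
toy = {"seqA": "MKTLLAAGV", "seqB": "MKSLLAAGV", "seqC": "MRSLLAAGI"}

rng = random.Random(0)  # fixed seed so the resampling is reproducible
replicates = [distance_matrix(bootstrap_alignment(toy, rng)) for _ in range(1000)]
```

Each of the 1000 matrices would feed one neighbor-joining tree, and the bootstrap support of a clade is the fraction of those trees in which the clade appears.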

Characterization of Mlo proteins
All exons identified from the AhMlo DNA sequences belonging to clade V were subjected to protein-protein BLAST (blastp) against NCBI protein datasets, and those exons were aligned with their corresponding Mlo-like protein sequences to predict their locations in the protein structure. The 3-kb upstream sequences of the clade V AhMlo loci were screened for core promoters using the BDGP promoter prediction tool (http://www.fruitfly.org/seq_tools/promoter.html) and for cis-acting regulatory elements (CREs) using New PLACE (Higo et al. 1999, https://www.dna.affrc.go.jp/PLACE/?action=newplace).
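Extracting the 3-kb upstream region requires attention to strand: for a gene on the minus strand, the upstream sequence lies after the gene's end coordinate and must be reverse-complemented. The following is a minimal sketch under the assumption of 1-based inclusive gene coordinates; the function names and the toy chromosome string are hypothetical and do not describe the tooling actually used.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]

def upstream_region(chrom_seq, gene_start, gene_end, strand, length=3000):
    """Return up to `length` bases upstream of a gene, read in the gene's orientation.

    gene_start and gene_end are 1-based inclusive coordinates on the plus strand.
    The region is truncated if the gene sits near a chromosome end.
    """
    if strand == "+":
        begin = max(0, gene_start - 1 - length)
        return chrom_seq[begin:gene_start - 1]
    if strand == "-":
        end = min(len(chrom_seq), gene_end + length)
        return reverse_complement(chrom_seq[gene_end:end])
    raise ValueError("strand must be '+' or '-'")

# Toy chromosome with a 'gene' at positions 5-8 (CCCC).
toy_chrom = "AAAACCCCGGGGTTTT"
print(upstream_region(toy_chrom, 5, 8, "+", length=3))  # bases 2-4 on the plus strand
print(upstream_region(toy_chrom, 5, 8, "-", length=3))  # bases 9-11, reverse-complemented
```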

Identification and distribution of AhMlo loci in peanut
To identify Mlo loci in the cultivated peanut, Mlo DNA sequences were searched in the reference genome of A. hypogaea. There were 107 homologous sequences related to Mlo. Among these, some were at the same location but differed in length, while others were identical sequences at different locations. After removing redundant sequences, a total of 34 Mlo locus sequences were used for analysis, and 25 of them were found to be significantly homologous to Mlo-like proteins by BLASTx searches against NCBI protein datasets. The predicted AhMlo loci were distributed on 14 chromosomes across the cultivated peanut genome (Supplemental file S1) and numbered 1-25 based on their chromosomal location (Fig. 1). These 25 AhMlo loci matched 7 Mlo-like proteins: AhMlo2, 7, 9, 18, 20, 23, and 24 matched Mlo1; AhMlo4, 6, 12, 14, and 17 matched Mlo4; AhMlo10, 16, and 21 matched Mlo6; AhMlo3, 8, 13, and 19 matched Mlo9; AhMlo1 and 25 matched Mlo10; AhMlo11 and 22 matched Mlo11; and AhMlo5 and 15 matched Mlo13 (Table 1). Five AhMlo loci were located on chromosome 8, three loci on each of chromosomes 14 and 17, two loci on each of chromosomes 4, 18, and 19, and one locus on each of the remaining chromosomes; no Mlo loci were observed on chromosomes 6, 7, 9, 10, 12, and 16. Among the predicted loci, AhMlo7 and 8, clustered on chromosome 8, were separated by 30 kb and showed no significant sequence similarity to each other. Similarly, AhMlo18 and 19, clustered on chromosome 17, did not show sequence similarity although the distance between them was less than 60 kb. This indicates that these clustered loci are not derived from tandem duplication events. Considering the A and B genomes, 11 AhMlo loci were identified on the A genome (chromosomes 1-10), while 14 loci belong to the B genome (chromosomes 11-20).
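The chromosome counts reported above can be cross-checked with a short tally. The per-chromosome numbers are taken directly from the text; the variable names are illustrative.

```python
# Loci per chromosome as stated in the text (chromosomes 1-10 = A genome, 11-20 = B genome).
loci_per_chromosome = {8: 5, 14: 3, 17: 3, 4: 2, 18: 2, 19: 2}
chromosomes_without_loci = {6, 7, 9, 10, 12, 16}

# Every remaining chromosome carries exactly one locus.
for chrom in range(1, 21):
    if chrom not in loci_per_chromosome and chrom not in chromosomes_without_loci:
        loci_per_chromosome[chrom] = 1

total = sum(loci_per_chromosome.values())
a_genome = sum(n for chrom, n in loci_per_chromosome.items() if chrom <= 10)
b_genome = sum(n for chrom, n in loci_per_chromosome.items() if chrom > 10)

print(total, len(loci_per_chromosome), a_genome, b_genome)  # prints: 25 14 11 14
```

The tally reproduces the 25 loci on 14 chromosomes and the 11/14 split between the A and B genomes.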
Cluster analysis of the 25 AhMlo locus sequences resulted in two groups: loci corresponding to MLO-like proteins 4 and 13 were present in only one group, those corresponding to proteins 6, 10, and 11 formed the other group, and those corresponding to proteins 1 and 9 were found in both groups (Fig. 2). The TM helices detected using TMHMM server 2.0 showed that the number of TM domains varied from 5 in AhMlo9 to 8 in AhMlo1 and 25. Most Mlo-like proteins (17) had the characteristic seven TM domains with three extracellular loops and three intracellular loops. There were four pairs of AhMlo loci in which both members corresponded to the same Mlo-like protein, had the same number of TM domains, and were located on the corresponding pair of chromosomes of genomes A and B. Specifically, AhMlo3-AhMlo13 were located on chromosomes A3 and B3, AhMlo5-AhMlo15 on chromosomes A4 and B4, AhMlo6-AhMlo17 on chromosomes A5 and B5, and AhMlo11-AhMlo22 on chromosomes A8 and B8, indicating that they are homologous loci. Although AhMlo9 and AhMlo24 both corresponded to Mlo1, they possessed different numbers of TM domains and were located on chromosomes A8 and B9, suggesting that they might be paralogous loci.
Gene structure and phylogenetic analysis
All AhMlo locus sequences were subjected to gene finding using the FGENESH-M and GSDS tools. Gene structure analysis showed structural differences among the AhMlo loci, as depicted in Fig. 3. The number of exons varied from 4 (AhMlo17) to 14 (AhMlo7), while the majority (60%) of AhMlo loci harbored 9-13 exons. All AhMlo loci carried introns, ranging from 4 (AhMlo17) to 14 (AhMlo24), and 25% of AhMlo loci retained 13 introns.
To predict the specific AhMlo loci responsible for PM susceptibility, we performed a phylogenetic analysis using peanut Mlo-like protein sequences along with Mlo homologs of 6 legume species. Multiple sequence alignment of the Mlo proteins showed the presence of seven clades, in which seven peanut Mlo-like protein sequences were grouped in clade I and II, two in clade IV and VI, three in clade V, and none in clade VII (Fig. 4). As expected, most Mlo protein sequences from other species responsible for PM susceptibility were clustered in clade V, including AtMlo2, 6, 12, MtMlo1, GmMlo6, 12, CaMlo2, 6, CcMlo1, 3, VrMlo1, 2, 5, and PvMlo6, 12.
Characterization of peanut MLO proteins related to PM susceptibility
Three AhMlo loci (AhMlo10, 16, 21) in clade V were homologous to Mlo-like protein 6 in peanut: AhMlo16 on chromosome B4 was homologous to XP_025647050, while AhMlo10 and 21 on chromosomes A8 and B8 were homologous to XP_025616322. The protein sequences of XP_025647050 and XP_025616322 shared only 52.2% identity, suggesting that they are different isoforms of the Mlo6 protein. To ascertain the regions of the Mlo protein that are important for susceptibility, the exons of each locus were aligned with the Mlo6-like protein sequence to determine their locations in the structure of the Mlo6 protein. AhMlo16 contained 5 exons, four of which displayed significant similarity to Mlo6, while the fifth exon did not show homology to any protein sequence in the NCBI protein datasets. This indicates that the locus may be truncated, and it was not used in the subsequent analysis. Notably, 5 of the 10 exons in AhMlo10 and 5 of the 7 exons in AhMlo21 were exactly the same and identical to the protein sequence at different regions of Mlo6 (XP_025616322). These five exons were located on extracellular loop 1 and intracellular loops 1 and 2. The other 5 exons of AhMlo10 mapped to intracellular loop 3 and the calmodulin-binding region, while the remaining 2 exons of AhMlo21 were located on intracellular loop 2 (Fig. 5).
Fig. 4 Phylogenetic tree analysis of the AhMlo along with MLO homologs from other crop species
Loss of susceptibility can be accomplished by disrupting the coding region of a susceptibility gene. It can also be achieved through the unique motifs and regulatory elements that mediate expression of the gene responsible for susceptibility. To identify regulatory elements in the promoter region, the 3,000 bases upstream of each locus were downloaded as the putative promoter region. Five core promoters were predicted in AhMlo10 and four in AhMlo21. The third and fourth core promoters in AhMlo21 were identical to the second and third core promoters in AhMlo10 except for a single SNP each, (A/T) and (T/G), respectively. Identification of cis-acting regulatory elements (CREs) showed 101 CREs in AhMlo10 and 110 CREs in AhMlo21 (Table S1). Comparison of CREs between the two loci suggested that the most abundant CREs in both putative promoter regions were CAATBOX1 (ID:S000028), CACTFTPPCA1 (ID:S000449), and DOFCOREZM (ID:S000265), which is consistent with observations from other plant species (Andolfo et al. 2019). These CREs were recorded 39 and 56, 21 and 33, and 21 and 30 times out of the total CREs in AhMlo10 and AhMlo21, respectively. In addition, we observed TC-box and thymine-rich motifs located in the distal promoter of both loci (more than 1.5 kb upstream of the transcriptional start site); these motifs have been presumed to be related to PM susceptibility in different plant species (Shen et al. 2012; Andolfo et al. 2019).
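The CRE tallies come from the New PLACE web tool, but the underlying idea of counting motif occurrences on both strands of a promoter can be sketched briefly. The motif strings below (CAAT for CAATBOX1, YACT for CACTFTPPCA1, AAAG for DOFCOREZM) are given from memory of the PLACE entries and should be treated as assumptions to verify against the database; the function names and the toy promoter are hypothetical, and the counts will not necessarily match the way PLACE reports hits.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]

def count_motif(seq, motif):
    """Count motif occurrences on both strands, including overlapping ones.

    Note: a palindromic motif is counted once per strand, so each site counts twice.
    """
    seq = seq.upper()
    # A lookahead matches at every start position, so overlapping hits are counted.
    pattern = re.compile("(?=" + "".join(IUPAC[base] for base in motif.upper()) + ")")
    forward = sum(1 for _ in pattern.finditer(seq))
    reverse = sum(1 for _ in pattern.finditer(reverse_complement(seq)))
    return forward + reverse

# Assumed motif sequences; check against the PLACE entries before relying on them.
motifs = {"CAATBOX1": "CAAT", "CACTFTPPCA1": "YACT", "DOFCOREZM": "AAAG"}

toy_promoter = "CAATCAATTACTCACTAAAAG"  # made-up sequence, not a peanut promoter
for name, motif in motifs.items():
    print(name, count_motif(toy_promoter, motif))
```

Applied to the 3-kb upstream regions of AhMlo10 and AhMlo21, a tally like this gives the kind of per-element counts compared between the two loci.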

Discussion
AhMlo loci distribution in the cultivated peanut

Identification and characterization of specific mlo for PM susceptibility
The specific Mlo loci related to PM susceptibility function as negative regulators of the hypersensitive response upon PM infection in plants. Loss-of-function mutations in specific Mlo genes confer broad-spectrum and durable disease resistance in various crops. Thus, genome-wide identification of susceptibility genes is a prerequisite for loss-of-function studies. In this study, two AhMlo loci were clustered in clade V with Mlo gene members from other plant species that are responsible for susceptibility. These two AhMlo loci were located on chromosomes A8 and B8, respectively, and were homologous to the same Mlo-like protein 6, indicating that they are homeologous loci. When the exons identified from the two homeologous AhMlo loci were aligned with the Mlo6 protein, five exons were the same in both loci and mapped to extracellular loop 1 and intracellular loops 1 and 2. They might be the intracellular domains responsible for PM susceptibility. However, AhMlo21 may have gone through an intragenic recombination event after tetraploidization, resulting in two additional exons in the intron region and the absence of five exons identified in AhMlo10. Such recombination can result in deletion, duplication, inversion, and fusion, leading to a loss of function of the gene. A further functional study inducing mutations in AhMlo21 will validate whether AhMlo21 is a pseudogene. Nevertheless, each of these exons can be a target for disruption of these AhMlo loci to provide the basis for functional studies in peanut. Reinstadler et al. (2010) pointed out that the second and third cytoplasmic loops of the Mlo protein are critical for its susceptibility-conferring activity in barley. The calmodulin-binding region was identified as an important Ca2+ signal response motif that mediates defense reactions against powdery mildew (Kim et al. 2002). All these domains can be useful targets for loss of susceptibility in resistance breeding.
Many studies have focused on developing lines resistant to PM by targeting the Mlo homologs in clades IV and V for monocots and dicots, respectively. However, loss of function does not result in resistance to PM for every homolog in clades IV and V. For instance, three tomato SlMLO genes were identified in clade V, but loss of function conferred PM resistance for only one of them (SlMLO1) (Andolfo et al. 2019; Zheng et al. 2016). Furthermore, four grapevine MLO genes (VvMLO6, 7, 11, and 13) were identified in clade V; silencing VvMLO7 in combination with VvMLO6 and 11 reduced PM severity, with the exception of VvMLO13 (Pessina et al. 2016). Two AhMlo loci were grouped in clade V in this study; therefore, these two AhMlo loci can be targeted for loss-of-function studies. Analyzing the expression level of these AhMlo loci upon pathogen infection can determine which one plays an important role in PM susceptibility.
Even though the molecular mechanisms modulating the expression of MLO susceptibility genes are not well understood, an in silico study by Andolfo et al. (2019) showed that two motifs are conserved among the putative promoter regions of the clade V MLO homologs related to PM susceptibility. In this study, in addition to the coding regions, the defense-responsive cis-regulatory elements TC-box and thymine-rich motifs were identified in the promoter regions of the two AhMlo loci. These defense-related CREs can be alternative targets for manipulating gene expression. CRISPR/Cas9 approaches can be used to induce mutations in the coding and promoter regions to interfere with the compatibility between the host and the pathogen and provide durable disease resistance (Zaidi et al. 2018). For instance, disruption of the promoter and coding regions of SWEET genes, which encode sugar transporters, using TALENs and the CRISPR/Cas9 system has provided enhanced resistance to rice bacterial blight (Blanvillain-Baufume et al. 2017). Mutations induced with CRISPR/Cas9 in the Mlo gene in tomato and in enhanced disease resistance 1 (EDR1), another PM-susceptibility gene, in wheat resulted in a significant reduction of PM (Nekrasov et al. 2017; Zhang et al. 2017). The findings of this study provide useful information on Mlo genes in peanut and suggest the potential utilization of susceptibility genes for resistance breeding in peanut.

Conclusions
Susceptibility genes favor pathogen growth by providing substances to the pathogen or by acting as negative regulators of immunity. Disruption of susceptibility genes can improve a plant's capacity to resist pathogens, as demonstrated in many plant species using specific Mlo genes. Therefore, susceptibility gene-mediated resistance offers a supplemental benefit in plant resistance breeding. Information on Mlo genes in peanut, especially the specific Mlo genes for powdery mildew susceptibility, is limited. In this study, 25 AhMlo loci were identified, two of which are putative specific susceptibility genes that can be used for loss-of-susceptibility studies and resistance breeding in peanut.

References
Shen Q, Zhao JM, Du CF, Xiang Y, Cao JX, Qin XR (2012) Genome-scale identification of MLO domain-containing genes in soybean (Glycine max L. Merr.). Genes Genet Syst 87:89-98
Wolter M, Hollricher K, Salamini F, Schulze-Lefert P (1993) The mlo resistance alleles to powdery mildew infection in barley trigger a developmentally controlled defence mimic phenotype. Mol Gen Genet 239:122-128
Yin KQ, Qiu JL (2019)