Molecular cloning and characterization of five genes from embryogenic callus in Miscanthus lutarioriparius

Regeneration from embryogenic callus of higher plants in tissue culture is regulated by explant type and developmental stage, as well as by specific genes. In Miscanthus lutarioriparius, five candidate genes were selected to assess differential expression between embryogenic and non-embryogenic calli: MlARF-GEP (guanine nucleotide-exchange protein of ADP ribosylation factor), MlKHCP (kinesin heavy chain like protein), MlSERK1 (somatic embryogenesis receptor-like kinase 1), MlSERK2 (somatic embryogenesis receptor-like kinase 2), and MlTypA (tyrosine phosphorylation protein A), with GenBank accession numbers KU640196–KU640200. Multiple sequence alignment showed that the five genes were highly conserved within their respective gene families. Phylogenetic analysis showed that the five genes were closest to homologous genes of Zea mays and Sorghum. The qRT-PCR results showed significant differences in the expression patterns of the five genes between the two callus types, with relative expression higher in embryogenic callus than in non-embryogenic callus. Furthermore, a Chi-square analysis of simple sequence repeat (SSR) markers showed a significant correlation between MlSERK1 genotype and induction of embryogenic callus in M. lutarioriparius. This study may lay a foundation for understanding the molecular mechanism of embryogenic callus induction in M. lutarioriparius and may provide a basis for further study of genetic manipulation.


Introduction
The establishment of an in vitro system through embryogenic callus of higher plants is a bottleneck step for production and genetic breeding. Embryogenic callus induction depends on cell totipotency, and some plant cells may become totipotent when they are treated under proper conditions (Fehér 2019). As research has deepened, somatic embryogenesis has been achieved in Arabidopsis thaliana (Mordhorst et al. 1998), Oryza sativa L. (Karthikeyan et al. 2009), Zea mays L. (Armstrong and Green 1985; Vasil and Vasil 1986), Hordeum vulgare L. (Pasternak et al. 1999), Gossypium hirsutum (Zheng et al. 2014), as well as in two Miscanthus species, Miscanthus sinensis and Miscanthus × giganteus (Gawel et al. 1990; Holme and Petersen 1996). Several factors probably affect Miscanthus × giganteus regeneration, including explant types, developmental stages, and culture media (Holme and Petersen 1996). In previous studies, somatic embryogenesis has been accomplished in a number of plants with different hormone concentrations or culture conditions; however, the molecular mechanism has not been clearly deciphered. To elucidate this mechanism, additional research on somatic embryogenesis should be carried out.
Communicated by Q. Wang.
It has been reported that some genes and transcription factors play key roles in somatic embryogenesis (Méndez-Hernández et al. 2019; Namasivayam 2007). Among these, the somatic embryogenesis receptor kinase (SERK) genes are recognized as the most important, being involved in a series of developmental processes such as callus induction and differentiation and cell totipotency in plants (Méndez-Hernández et al. 2019; Pilarska et al. 2016). SERK was first considered an important marker for identifying embryogenic cells in carrot (Schmidt et al. 1997). In Arabidopsis thaliana, loss of SERK was shown to result in abnormal embryo development (Li et al. 2019). SERK homologs associated with somatic embryogenesis have also been detected in other species, such as rice (Singla et al. 2009), maize (Baudino et al. 2001), wheat (Singh and Khurana 2012), Medicago truncatula (Nolan et al. 2009), and Citrus unshiu Marc. (Shimada et al. 2005). Three critical cDNA fragments involved in embryogenic callus induction, ZmARF-GEP, ZmKHCP, and ZmTypA, were isolated in maize by cDNA amplified fragment length polymorphism analysis (Sun et al. 2012). All three genes showed higher expression in embryogenic callus than in non-embryogenic callus. In addition, the Arabidopsis leafy cotyledon (LEC) genes, AtLEC1 and AtLEC2, promoted somatic embryogenesis when they were transferred into tobacco (Guo et al. 2013) and are also considered key regulators of somatic embryogenesis in cassava (Brand et al. 2019). A previous study also found that the carrot homolog of Arabidopsis LEC1 was expressed only in somatic embryos (Yamamoto et al. 2005).
Miscanthus lutarioriparius, a C4 species of the genus Miscanthus that originated in China, has great potential as a biomass energy crop (Zhao et al. 2013). High-efficiency regeneration from an embryogenic callus system is a key bottleneck step for its transgenic engineering. Because M. lutarioriparius is distributed in the wild, it is hard to find suitable samples for tissue culture. Molecular marker-assisted breeding has focused on the use of random markers for quantitative trait locus (QTL) analysis of traits, such as single-nucleotide polymorphism (SNP), simple sequence repeat (SSR), and amplified fragment length polymorphism (AFLP) markers (Amini et al. 2016; Feng et al. 2016; Lu et al. 2015).
In this study, a total of 37 M. lutarioriparius samples were used for callus induction. To investigate the mechanism of embryogenic callus formation in M. lutarioriparius, five candidate genes were used to assess differential expression between embryogenic and non-embryogenic calli, which were distinguished by the morphology of paraffin sections under an optical microscope. Three full-length candidate genes and two partial gene sequences were cloned based on the maize and sorghum genome sequence databases. In addition, one of the genes showed a significant correlation between genotype and induction of embryogenic callus in M. lutarioriparius. This study may lay a foundation for understanding the molecular mechanism of embryogenic callus induction in M. lutarioriparius and may provide a basis for further study of genetic manipulation.

Preparation of biological materials
Thirty-seven samples of M. lutarioriparius grown in Ezhou, Hubei Province, China, were used in this study. Sample information is listed in Table S1 and the distribution is shown in Fig. S1. Immature inflorescence tissues were harvested in the middle of September and surface-sterilized by washing with 75% ethyl alcohol for 30 s and 0.1% (w/v) mercuric chloride (HgCl2) for 1.5 min, followed by thorough rinsing (at least three times) with sterile distilled water. The tissues were then placed onto induction medium (4 mg/L 2,4-D) according to the methods of Zhao et al. (2013). The induced calli were transferred to fresh medium of the same composition after 21 days and subcultured every 3 weeks. The characteristics of the calli induced from the 37 samples were recorded after two rounds of subculturing. Induction frequencies were calculated with SPSS (Statistical Package for the Social Sciences) 14.0. The samples with higher frequencies of embryogenic callus induction, i.e., I065 and I093, were used for RNA extraction, gene cloning, and expression study.

RNA extraction and cDNA synthesis
Plant materials were stored at −80 °C, and total RNA was extracted from embryogenic calli using the RNAprep pure Plant Kit (TIANGEN BIOTECH., LTD, Beijing, China). DNase treatment was conducted prior to column purification of RNA according to the manufacturer's instructions (TIANGEN BIOTECH). cDNAs were synthesized using M-MLV Reverse Transcriptase (Promega Corporation, USA). PCR was carried out with KOD-Plus-Neo DNA Polymerase (TOYOBO CO., LTD, Osaka, Japan) with cDNAs as templates. PCR primers, shown in Table 1, were designed from Zea mays and Sorghum sequence data in NCBI. The PCR reaction mixture contained 5 μl of 2 mM dNTPs, 5 μl of 10 × PCR Buffer for KOD-Plus-Neo (TOYOBO, Osaka, Japan), 3 μl of 25 mM MgSO4, 15 pmol of a primer pair, 2 μl of cDNA template, 2 μl of dimethylsulphoxide (DMSO), 1 μl of KOD-Plus-Neo, and 29 μl of ddH2O, for a total volume of 50 μl. PCR amplification was set to pre-denaturation at 94 °C for 2 min, followed by 36 cycles of denaturation at 98 °C for 10 s and annealing at 68 °C for 2-5 min (the criterion: about 1 min per 1000 bp), with extension at 16 °C for 1 min as the last step. PCR products were cloned into pGEM-T vectors (Promega) using T4 ligase, followed by transformation into E. coli DH5α competent cells (TransGen, Beijing, China). Ten positive clones of E. coli in each species were selected for sequencing. The plasmids of each gene were isolated using a plasmid extraction kit (Tiangen) and stored at −20 °C for use in other experiments.
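The reaction mixture and the 1 min per 1000 bp criterion described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original protocol: the function names are hypothetical, and the primer volume is an assumption inferred as the remainder of the 50 μl total, because the text gives the primers only as 15 pmol.

```python
# Stated component volumes (in microlitres) of the 50 ul PCR mixture.
# The primer pair is given only as 15 pmol in the text, so its volume
# is NOT listed here; it is inferred below as an assumption.
STATED_VOLUMES_UL = {
    "2 mM dNTPs": 5.0,
    "10x PCR buffer": 5.0,
    "25 mM MgSO4": 3.0,
    "cDNA template": 2.0,
    "DMSO": 2.0,
    "KOD-Plus-Neo": 1.0,
    "ddH2O": 29.0,
}
TOTAL_UL = 50.0


def implied_primer_volume(stated=STATED_VOLUMES_UL, total=TOTAL_UL):
    """Volume left for the primer pair once all stated components are added."""
    return total - sum(stated.values())


def final_concentration(stock_conc, volume_ul, total=TOTAL_UL):
    """Final concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * volume_ul / total


def step_minutes(amplicon_bp, min_per_kb=1.0):
    """Duration of the 68 degC step using the rule of about 1 min per 1000 bp."""
    return amplicon_bp / 1000.0 * min_per_kb


if __name__ == "__main__":
    print("implied primer volume (ul):", implied_primer_volume())
    print("final dNTP concentration (mM):", final_concentration(2.0, 5.0))
    print("final MgSO4 concentration (mM):", final_concentration(25.0, 3.0))
    print("68 degC step for a 5385 bp amplicon (min):", step_minutes(5385))
```

Under these assumptions, the stated components sum to 47 μl, which leaves 3 μl for the primer pair. For the 5385 bp MlARF-GEP ORF, the rule gives about 5.4 min, close to the upper end of the stated 2-5 min range.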

Sequence alignment and phylogenetic analysis
We used the online BLAST server (https://blast.ncbi.nlm.nih.gov/Blast.cgi) to determine the identity of the candidate genes. ORF Finder (https://www.ncbi.nlm.nih.gov/projects/gorf/) was used for open-reading frame (ORF) prediction. DNA and protein sequence alignments were performed with the ClustalX 2.1 program (Larkin et al. 2007). The phylogenetic analysis of candidate genes was conducted with MEGA 5.1 (Tamura et al. 2011). InterPro (https://www.ebi.ac.uk/interpro/) was used to identify conserved domains and functional sites. ARF-GEPs, KHCPs, and TypAs from other species, including both monocots and dicots (Tables S2, S3, and S5), were chosen for phylogenetic analyses. SERKs from other species, including monocots, dicots, mosses, and ferns, were chosen for phylogenetic analyses (Table S4). A basic neighbor-joining tree was built with MEGA 5.1.
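To illustrate what ORF prediction involves, the sketch below scans the three forward reading frames of a cDNA sequence for ATG-to-stop stretches. It is a simplified illustration under stated assumptions, not the algorithm of NCBI ORF Finder: it ignores the reverse strand and alternative start codons, and the function names and the `min_len_nt` threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_len_nt=30):
    """Return (start, end, frame) tuples for ATG...stop ORFs on the forward strand.

    start is 0-based, end is exclusive and includes the stop codon.
    Only the first ATG after the previous stop is used in each frame.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len_nt:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next ATG
    return orfs


def longest_orf(seq, min_len_nt=30):
    """Return the longest ORF found by find_orfs, or None if there is none."""
    orfs = find_orfs(seq, min_len_nt)
    return max(orfs, key=lambda o: o[1] - o[0]) if orfs else None


if __name__ == "__main__":
    toy = "CCATGAAATTTGGGTAACC"  # toy sequence, not from the study
    print(find_orfs(toy, min_len_nt=9))
```

An ORF that lacks either the start or the stop codon within the cloned fragment would not be reported by this sketch, which is one way a partial ORF, such as those described for MlKHCP and MlTypA, differs from a full-length one.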

Cloning of MlARF-GEP, MlKHCP, MlSERK1, MlSERK2, and MlTypA
By amplification with degenerate primers designed on sequences of Sorghum and maize, we cloned the full-length ORFs of the cDNA sequences of three genes and named them MlARF-GEP, MlSERK1, and MlSERK2 (GenBank accession numbers KU640196, KU640198, and KU640199) based on the obtained cDNA. However, only partial ORFs of the cDNA sequences were obtained for the other two genes, which were named MlKHCP and MlTypA (GenBank accession numbers KU640197 and KU640200) based on the obtained cDNA. The PCR amplification results for all the genes are shown in Fig. 2. The complete ORF sequence of MlARF-GEP is 5385 bp long

Sequence analysis of MlARF-GEP, MlKHCP, MlSERK1, MlSERK2, and MlTypA
The protein sequence and length of MlARF-GEP were quite conserved compared with those of other monocots (Brachypodium distachyon, Oryza brachyantha, Oryza sativa Japonica Group, Oryza sativa, Setaria italica, Sorghum bicolor, and Zea mays) and dicots (Arabidopsis thaliana, Brassica napus, Malus domestica, and Ricinus communis) by multiple alignment analysis (Fig. S2). The highest identity with MlARF-GEP was 99% (SbARF-GEP). The neighbor-joining tree built with MEGA 5.1 showed that the ARF-GEPs were separated into monocot and dicot groups (Fig. 3a). This suggests that ARF-GEP evolved independently after the divergence of monocot and dicot plants. In addition, MlARF-GEP exhibited the closest genetic relationship with SbARF-GEP (Fig. 3a). Multiple alignment analysis showed that MlKHCP was conserved in both monocots and dicots (Fig. S3), and the neighbor-joining tree showed a pattern similar to that of MlARF-GEP; MlKHCP exhibited the closest genetic relationship with SbKHCP (Fig. 3b). Furthermore, MlSERK1 and 2 were also found to be conserved in protein sequence and length compared with those of monocots (Oryza sativa Indica Group, Sorghum bicolor, Triticum aestivum, and Zea mays), dicots (Arabidopsis thaliana, Citrus sinensis, Glycine max, Gossypium hirsutum, Nicotiana benthamiana, Solanum lycopersicum, and Vitis vinifera), mosses (Physcomitrella patens), and ferns (Selaginella moellendorffii) (Fig. S4). The neighbor-joining tree built with MEGA 5.1 showed that the SERKs were separated into four key groups: monocots, dicots, mosses, and ferns (Fig. 3c). MlSERK1 exhibited the closest genetic relationship with SbSERK2, and MlSERK2 exhibited the closest genetic relationship with ZmSERK2 and SbSERK3 (Fig. 3c). Multiple alignment analysis showed that MlTypA was conserved in both monocots and dicots (Fig. S5), and the neighbor-joining tree showed a pattern similar to that of MlARF-GEP; MlTypA exhibited the closest genetic relationship with SbTypA (Fig. 3d).
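As an illustration of how an identity value such as the 99% reported for SbARF-GEP can be derived from an alignment, the sketch below computes percent identity for two already-aligned sequences, together with the corresponding p-distance, which is one of the pairwise distances a neighbor-joining tree can be built from. It is a minimal sketch under an assumed definition (identical residues divided by ungapped columns); alignment tools differ in how they count gaps, the text does not state which definition was used, and the function names are hypothetical.

```python
def percent_identity(aln_a, aln_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumption; tools differ here)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def p_distance(aln_a, aln_b):
    """Proportion of differing sites, usable as a simple pairwise distance."""
    return 1.0 - percent_identity(aln_a, aln_b) / 100.0


if __name__ == "__main__":
    # Toy aligned peptides, not sequences from the study.
    print(percent_identity("MKT-LV", "MKTALI"))
    print(p_distance("MKT-LV", "MKTALI"))
```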

Expression patterns of five genes in two types of embryogenic calli and non-embryogenic calli
The amplification specificity of the primers for MlARF-GEP, MlKHCP, MlSERK1, MlSERK2, MlTypA, and 18sRNA1 was examined (Fig. S6). The amplification efficiencies of the six primer pairs were 107.18%, 92.64%, 94.84%, 98.03%, 100.33%, and 106.00%, and their standard curves are shown in Fig. S6. The R2 values of the six standard curves were all between 0.99 and 1.0 (Fig. S7). The five genes were expressed at all developmental stages in both types of calli. As expected, the five genes showed similar expression patterns across the four callus developmental stages. The highest expression levels of MlARF-GEP, MlKHCP, MlSERK1, MlSERK2, and MlTypA were observed in the third week after induction in embryogenic callus (Fig. 4). Furthermore, the lowest expression levels were found in the fourth week. In addition, the relative expression levels in embryogenic calli exceeded those in non-embryogenic calli at all four developmental stages; the expression of MlARF-GEP, MlKHCP, MlSERK1, MlSERK2, and MlTypA in embryogenic calli was approximately several hundred-fold higher than in non-embryogenic calli. Interestingly, the expression of the five genes plummeted in the fourth week.
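Amplification efficiencies and R2 values such as those reported above are conventionally derived from the slope of a standard curve of Ct against log10 template quantity, using E = 10^(-1/slope) - 1. The sketch below shows that calculation, together with a 2^(-ΔΔCt) fold change. It rests on assumptions: the text does not state which relative quantification method was used, the 2^(-ΔΔCt) form presumes efficiencies near 100%, the function names are hypothetical, and the dilution series is synthetic.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def amplification_efficiency(slope):
    """Efficiency in percent from the standard-curve slope: (10^(-1/slope) - 1) * 100."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


def fold_change_ddct(ct_target_a, ct_ref_a, ct_target_b, ct_ref_b):
    """Relative expression of sample A versus sample B by the 2^(-ddCt) method."""
    ddct = (ct_target_a - ct_ref_a) - (ct_target_b - ct_ref_b)
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Synthetic tenfold dilution series with an ideal slope (not study data).
    ideal_slope = -1.0 / math.log10(2.0)
    log_quantity = [0.0, -1.0, -2.0, -3.0, -4.0]
    ct_values = [15.0 + ideal_slope * q for q in log_quantity]
    slope, intercept, r2 = linear_fit(log_quantity, ct_values)
    print("slope:", slope, "R2:", r2)
    print("efficiency (%):", amplification_efficiency(slope))
    # Hypothetical Ct values: a ddCt of -8 corresponds to a 256-fold difference.
    print("fold change:", fold_change_ddct(20.0, 15.0, 28.0, 15.0))
```

A slope of about -3.32 corresponds to 100% efficiency, so values above 100% imply a shallower slope and values below 100% a steeper one. A difference of several hundred-fold corresponds to a ΔΔCt of roughly 8 to 9 cycles under this model.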

Polymorphism analysis of SSR markers in five genes
The induction frequencies and descriptions of calli from 37 individuals of M. lutarioriparius are shown in Table S1. Embryogenic calli were induced in only nine individuals. Based on the sequences of the five genes in the Sorghum genome database, no SSR loci were detected by SSR Hunter 1.3 analysis in the MlARF-GEP and MlTypA genes. Although five SSR loci in MlKHCP and one SSR locus in MlSERK2 were detected, no polymorphism was observed. Interestingly, two SSR loci, MlSERK1-3 (Fig. S8) and MlSERK1-4, amplified three alleles in the 37 M. lutarioriparius individuals. The variation at MlSERK1-3 was significantly associated with the induction characteristic, whereas that at MlSERK1-4 was not (Table 3). The individuals which could be
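The abstract describes the SSR statistics as a Chi-square test of the association between genotype and embryogenic callus induction. A minimal sketch of a Pearson Chi-square test of independence on a contingency table follows. The counts are invented placeholders and are NOT the data of Table 3, the grouping into three genotype classes is an assumption, and the function names are hypothetical. The closed-form p-value shown is valid only for 2 degrees of freedom.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    table is a list of rows, e.g. rows = callus type, columns = genotype class.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def p_value_df2(stat):
    """Upper-tail p-value of the chi-square distribution with exactly 2 df."""
    return math.exp(-stat / 2.0)


if __name__ == "__main__":
    # Hypothetical counts only (NOT from Table 3):
    # rows = embryogenic / non-embryogenic, columns = three genotype classes.
    placeholder = [
        [6, 2, 1],
        [4, 10, 14],
    ]
    stat, df = chi_square_independence(placeholder)
    print("chi-square:", stat, "df:", df)
    if df == 2:
        print("p-value:", p_value_df2(stat))
```

With only nine embryogenic individuals out of 37, some expected cell counts are likely to be small, and in that situation an exact test is often preferred over the Chi-square approximation.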

Discussion
Immature inflorescence tissues are frequently used as explants in tissue culture of Miscanthus species (Holme and Petersen 1996; Zhang et al. 2012); however, the genotype of the materials, the culture conditions of the calli, and the concentration of plant growth regulators in the medium simultaneously affect the growth and development of embryogenic calli (Glowacka et al. 2010). Similar results were also reported in rice (Niroula et al. 2005) and turfgrass (Salehi and Khosh-Khui 2005), and inflorescence length was one of the factors in embryogenic callus induction in orchard grass (Çeliktaş et al. 2014). In this study, a total of 37 individuals of M. lutarioriparius were used for callus induction; we observed very different induction frequencies (  (Holme and Petersen 1996; Zhao et al. 2013), and thus, the expression patterns of MlSERK1 and 2 supported the above method at the molecular level. ZmARF-GEF is reported to belong to the ADP ribosylation factor (ARF) family; its function relies on the alternation between GDP and GTP (Morinaga et al. 1996). In maize, ZmARF-GEF was deduced to be associated with auxin flow and polarization, which further influenced the formation of embryogenic calli (Sun et al. 2012). MlARF-GEP was closely related to ZmARF-GEF. In addition, the expression pattern of MlARF-GEP in the two types of callus implied that MlARF-GEP in M. lutarioriparius and ZmARF-GEF in Zea mays play a similar role.
KHCP may be a kinesin heavy chain like protein that can move along cytoskeletal fibers by hydrolyzing ATP to generate energy (Vallee and Shpetner 1990); such proteins play a significant role in chromosome segregation, assisting the function of the spindle during mitosis and meiosis (Sawin
Error bars indicate standard deviation of three biological replicates. "*" P < 0.05, "**" P < 0.01.
(Sun et al. 2012). Homologous sequence alignment analysis showed that MlTypA was highly similar to TypA (tyrosine phosphorylation protein A) proteins from other species. Phylogenetic analysis showed that MlTypA is close to SbTypA and ZmTypA. TypA is a new protein of the ribosome-binding GTPase superfamily, which is involved in various regulatory pathways (Margus et al. 2007). AtTypA genes are involved in pollen tube growth in Arabidopsis (Lalanne et al. 2004). A TypA cDNA fragment was isolated from embryogenic calli in maize, and the authors suggested that ZmTypA may perform and regulate roles similar to those in other species (Sun et al. 2012).
Our study suggests that the MlARF-GEP, MlKHCP, MlSERK1, MlSERK2, and MlTypA genes participate in the process of embryogenic callus formation. In conclusion, the qRT-PCR technique provided valuable information for distinguishing gene expression levels between embryogenic and non-embryogenic calli. Five candidate genes involved in the formation of embryogenic calli were cloned. The identification of these candidate genes helps in understanding the underlying mechanisms of somatic embryogenesis. However, the precise functions of these candidate genes remain unclear. An effective plant regeneration system has been established, and a transformation system is being developed in our lab. In further work, the exact roles of the five genes should be explored by RNA interference or overexpression in M. lutarioriparius.