Isolation and structural analysis of the Bx6 and Bx7 genes controlling the biosynthesis of benzoxazinoids in rye (Secale cereale L.)

Benzoxazinoids (BXs) are plant secondary metabolites, first discovered in the 1950s, which are synthesized in many monocotyledonous species of the Poaceae family and in several dicotyledonous plants. They constitute a significant element of the plant’s defence mechanism against both biotic (pests and diseases) and abiotic (elevated salinity, heavy metals) stresses. The aim of this research was to gain more information on the genetic background of BXs biosynthesis in rye (Secale cereale L.) by isolating and sequencing the ScBx6L318 and ScBx7L318 genes, which encode 2-oxoglutarate-dependent dioxygenase and 7-O-methyltransferase, respectively. Using the modified Amplicon Express method, BAC clones containing the ScBx6L318 and ScBx7L318 genes were isolated and sequenced. The similarity between the ScBx6L318 and ZmBx6 genes was 78% in the coding sequence (cds) and 68% in the amino acid sequence (AA). A lower similarity was found between the ScBx7L318 and ZmBx7 genes (72% and 45% at the cds and AA levels, respectively).

et al. 2015), as well as ScBx1-ScBx6-like (cds, Tanwir et al. 2017), ScGlu (protein sequence, Nikus et al. 2003), and ScGT (mRNA sequence, Sue et al. 2011). Recently, Groszyk et al. (2017) proved the biological function of the ScBx1 gene using the virus-induced gene silencing approach. To date, only the ScBx6-like gene cds has been isolated in rye (Tanwir et al. 2017). Moreover, the existence of the ScBx6 and ScBx7 genes was questioned by, for example, Frey et al. (2003), despite the confirmed synthesis of DIMBOA. Therefore, we assumed that both genes must be present in the rye genome.

Plant and biological material
The BAC library used for isolation of the ScBx6 L318 and ScBx7 L318 genes, constructed from the DNA of rye inbred line L318, consisted of 39 superpools, each containing 2688 clones.
DNA for primer verification was isolated from the leaves of winter rye (Secale cereale L.) inbred line L318 using a CTAB method described by Murray and Thompson (1980). Plants were grown in a Conviron CMP 6050 climate-controlled chamber (photoperiod 16 h day/8 h night, temperature 22/20 °C, light intensity 200 μE, humidity 70%) for 2 weeks.

Isolation of positive BAC clones and PCR
The positive clones, i.e., those containing the genes of interest, were isolated using the modified Amplicon Express Strategy (https://ampliconexpress.com/products-services/screening-services/pools-and-superpools). In the first stage, the PCR template was the gDNA (at a concentration of 100 ng/1 μl) of the 39 superpools. The next step was a PCR with plate (W), row (R), and column (T) pools as templates from the superpools selected in the first stage. The PCRs contained 500 ng of total genomic DNA, 3 μM of forward and reverse primers, 0.2 mM of dNTPs, 0.5 mM of MgCl2, 1× PCR buffer, and 3 units of DreamTaq polymerase (Fermentas) in a total volume of 15 μl. Amplification was performed in a GeneAmp PCR System 9700 thermal cycler using the following conditions: (1) 94 °C for 1 min; (2) 35 cycles of 94 °C for 30 s, 60 °C for 30 s, and 72 °C for 60 s; and (3) 72 °C for 5 min. The products were separated on a 1% agarose gel, stained with ethidium bromide, and visualised on a UV transilluminator. Selected BAC clones were suspended in LB medium with chloramphenicol and sent to the GENOMED S.A. company, where MiSeq (Illumina) PE250 and CLC Genomics Workbench v. 7.5 were used for sequencing and assembly, respectively. The resulting contigs were analysed bioinformatically to identify the ScBx L318 genes.
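The pool-based screening described above amounts to a simple deconvolution: within a positive superpool, a clone address is recovered by intersecting the positive plate, row, and column pools. The sketch below only illustrates that logic and is not the Amplicon Express protocol itself. The layout of 7 plates of 384 wells (16 rows × 24 columns, which gives the 2688 clones per superpool stated above), the use of letters for rows, and all function names are assumptions made for the example.

```python
from itertools import product

# Assumed layout (not stated in the text): 7 plates x 16 rows x 24 columns
# = 2688 clones per superpool, i.e. standard 384-well plates.
PLATES, ROWS, COLS = 7, 16, 24


def candidate_clones(pos_plates, pos_rows, pos_cols):
    """Return every (plate, row, column) address consistent with the
    positive plate, row, and column pools of one superpool."""
    return sorted(product(sorted(pos_plates), sorted(pos_rows), sorted(pos_cols)))


def is_unambiguous(pos_plates, pos_rows, pos_cols):
    """A single positive pool in each dimension pins down exactly one clone."""
    return len(pos_plates) == len(pos_rows) == len(pos_cols) == 1


clones_per_superpool = PLATES * ROWS * COLS
hits = candidate_clones({3}, {"F"}, {12})  # hypothetical PCR result
```

If more than one pool is positive in any dimension, several candidate addresses remain and each would need an individual confirmation PCR.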

Primer design
To isolate the ScBx6 L318 and ScBx7 L318 genes, primers were designed based on the sequences of their maize orthologs ZmBx6 and ZmBx7 (GenBank Acc. No. AF540907 and EU192149, respectively). After fragmentary sequences of the ScBx6 L318 and ScBx7 L318 genes had been established, rye-specific primers were designed based on a BLAST analysis against the rye genome sequence (Bauer et al. 2017, GenBank assembly: GCA_900079665.1). Four primer pairs, corresponding to two selected amplicons per gene, were used for BAC library screening (Table 1). The designed primers were synthesised at GENOMED S.A. in Warsaw.

Bioinformatic analysis
Sequence analysis was performed using the Sequencher 4.5 program. BioEdit 7.0.9.0 software was used to identify the ScBx6 L318 and ScBx7 L318 genes in positive BAC clones. The FGENESH program (SoftBerry; monocot plant parameters: generic, corn, rice, wheat, barley; https://linux1.softberry.com/; Solovyev 2007) was employed to determine the structure of the analysed genes. The I-TASSER program (https://zhanglab.ccmb.med.umich.edu/I-TASSER/; Zhang 2008; Roy et al. 2010; Yang et al. 2015) was used to create structure models of the proteins encoded by the ScBx6 L318, ScBx6 Lo7, ScBx6-like PICASSO, ScBx7 L318, and ScBx7 Lo7 genes. The cDNA sequences were translated into amino acid sequences and subjected to I-TASSER analysis. Out of the five models returned for each protein, the highest-scoring one was chosen based on the C-score. Phylogenetic trees were constructed based on the cds of genes encoding dioxygenases and O-methyltransferases from the NCBI and ENA databases. These were created with the MEGA6 program (Tamura et al. 2013) using the Neighbor Joining algorithm (Saitou and Nei 1987) and the Maximum Composite Likelihood method (Nei and Kumar 2000), with 1000 bootstrap replicates (Felsenstein 1985). The other applications used in the bioinformatic analysis were:
• BLASTN 2.7.1+ (Zhang et al. 2000) - to determine the localization of the ScBx6 L318 and ScBx7 L318 genes on rye chromosomes
• 'Clustal Omega' - for a sequence comparison to find the SNP and INDEL polymorphisms (https://www.ebi.ac.uk/Tools/msa/clustalo/)
• 'EMBOSS Transeq' - to translate the nucleotide sequence into an amino acid sequence (https://www.ebi.ac.uk/Tools/st/emboss_transeq/)
• ENA database (www.ebi.ac.uk/ena) - to find the Bx6 and Bx7 genes in wheat and to analyse the rye genome sequence
• NCBI database (https://ncbi.nlm.nih.gov/) - to analyse the rye genome sequence and to find the Bx6 and Bx7 genes in line Lo7
• NCBI database BLAST algorithm - to find sequences highly similar to the analysed sequence and to determine the degree of similarity between the compared nucleotide and amino acid sequences (https://blast.ncbi.nlm.nih.gov/Blast.cgi)
• PCRPrimerStats program (Stothard 2000) - to assess the quality of the designed primers (the presence of so-called "hairpins")
• 'PlantCare' - for an analysis of the ScBx6 L318 and ScBx7 L318 gene promoters within 749 bp upstream of the ATG codon (Lescot et al. 2002; https://bioinformatics.psb.ugent.be/webtools/plantcare/html/)
• Primer3 v. 0.4.0 (Koressaar and Remm 2007; Untergasser et al. 2012) - for the primer design
• 'showseq' - to determine the sequence length (https://emboss.bioinformatics.nl/cgi-bin/emboss/showseq)
• the application available at https://eu.idtdna.com/calc/analyzer - to assess the tendency of the designed primers to form homo- and hetero-dimers
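The step of translating a cds into an amino acid sequence can be illustrated with a short, self-contained sketch. It is not the EMBOSS Transeq implementation. It assumes the standard genetic code (NCBI translation table 1) and reading frame 1, and the function names are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code (NCBI translation table 1) in TCAG codon order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_cds(cds):
    """Translate a coding sequence in frame 1; unknown codons become 'X'."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def protein_length(cds_length_bp):
    """Residues encoded by a complete cds that ends with one stop codon."""
    return cds_length_bp // 3 - 1


example_peptide = translate_cds("ATGGCCTAA")  # toy sequence, not a rye gene
```

Under these assumptions, a 1125 bp cds corresponds to 374 residues and a 1098 bp cds to 365 residues, in line with the protein lengths discussed later in the paper.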

Results
Screening of the BAC library identified the ScBx6 L318 and ScBx7 L318 genes in two separate superpools for each gene. The plate, row, and column pools from these four superpools were then screened to determine the coordinates of the clones containing the studied genes. Analysis of the results indicated the plates on which the clones containing the studied genes were located.

Structure and localization of the ScBx6 and ScBx7 genes
Based on the bioinformatic analysis, the length and structure of the ScBx6 L318 and ScBx7 L318 genes were predicted.

Phylogenetic analysis of Bx genes
Two phylogenetic trees were constructed based on the cds of genes encoding dioxygenases (Fig. 1a) and O-methyltransferases (Fig. 1b).
The ScBx6 L318, ScBx6 Lo7, and ScBx6-like PICASSO genes were compared to three genes from maize and one gene from S. italica encoding dioxygenases. Additionally, one wheat gene (Acc. No. CBD24305, ENA database), designated by us as the TaBx6-like gene, was included in the analysis. The similarity of the TaBx6-like gene to other Bx6 genes ranged from 77 to 93% in cds and from 67 to 92% at the AA level (supplementary materials, D1-D2). On the resulting phylogenetic tree, the ScBx6 L318, ScBx6 Lo7, and ScBx6-like PICASSO genes, along with the wheat TaBx6-like gene, were distributed in two strongly supported branches (bootstrap values of 92 and 97%). The remaining part of the tree was composed of three branches, with more distant sequences of dioxygenases from S. italica and maize.
The ScBx7 L318 and ScBx7 Lo7 genes were compared to nine genes encoding O-methyltransferases (six genes from maize, two genes from wheat, and one gene from S. italica). A wheat gene that we named TaBx7-like (GenBank Acc. No. AK330822) showed 76% identity in cds and 64% at the AA level to both the ScBx7 L318 and ScBx7 Lo7 genes, and 69% and 44%, respectively, to ZmBx7 (supplementary materials, E1-E2). The analysed genes were grouped in four strongly supported clusters (bootstrap values of 100%). An apparent separation of ScBx7 L318 and ScBx7 Lo7, together with the TaBx7-like gene, from the remaining three clusters was observed. Furthermore, of the remaining three clusters, the one containing the gene from S. italica and ZmBx7 was the least distant from the cluster of the ScBx7 L318, ScBx7 Lo7, and TaBx7-like genes.
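The identity percentages reported above come from pairwise sequence comparisons. As a minimal sketch of how such a value can be derived from two already aligned sequences, the function below counts matching positions over the columns where neither sequence has a gap. This is only one possible convention and an assumption of the example: BLAST, for instance, reports identities over the full alignment length, gaps included, so its values can differ. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gapped columns are skipped under this convention
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


toy_identity = percent_identity("AT-GC", "ATTGA")  # toy alignment, not real data
```

The same function works for aligned amino acid sequences, which is how a cds-level and an AA-level value can be obtained for the same gene pair.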

In silico analysis
To confirm the function of the newly isolated genes of line L318, a bioinformatic analysis of the 749 bp 5′ regulatory sequence, including the core promoter and the proximal and distal regulatory elements, was performed. In both genes, stress-specific motifs (SSM) were identified. For ScBx6 L318, eleven SSM were found: cis-acting regulatory elements involved in MeJA-responsiveness (2), cis-acting elements involved in abscisic acid responsiveness (7), a cis-acting element involved in low-temperature responsiveness (1), and a cis-acting element involved in defence and stress responsiveness (1). In turn, for ScBx7 L318 only two SSM were identified: a cis-acting element involved in low-temperature responsiveness (1) and an MYB binding site involved in drought-inducibility (1) (Table 2).
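The promoter analysis above can be pictured as two steps: taking the region upstream of the ATG codon, then counting motif occurrences on both strands. The sketch below shows that idea only and does not reproduce PlantCARE, which uses its own curated motif collection. The helper names, the toy sequence, and the use of ACGTG as an ABRE-like core are assumptions for illustration. The per-category counts for ScBx6 L318 are the ones given in the text.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def upstream_region(genomic_seq, atg_index, length=749):
    """Return up to `length` bases immediately 5' of the ATG start codon."""
    return genomic_seq[max(0, atg_index - length):atg_index].upper()


def count_motif(region, motif):
    """Count overlapping occurrences of `motif` on both strands."""
    motif = motif.upper()
    targets = {motif, reverse_complement(motif)}  # a palindrome is counted once
    return sum(region.startswith(t, i) for t in targets for i in range(len(region)))


# Toy sequence (hypothetical), with the ATG start codon at index 16.
toy_genomic = "GGACGTGTTCACGTCCATGAAA"
toy_promoter = upstream_region(toy_genomic, toy_genomic.index("ATG"))
toy_abre_like = count_motif(toy_promoter, "ACGTG")  # assumed ABRE-like core

# SSM counts for ScBx6 L318 as listed in the text.
ssm_scbx6 = {
    "MeJA-responsiveness": 2,
    "abscisic acid responsiveness": 7,
    "low-temperature responsiveness": 1,
    "defence and stress responsiveness": 1,
}
total_ssm_scbx6 = sum(ssm_scbx6.values())
```

Summing the four categories gives the eleven SSM reported for ScBx6 L318.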

Discussion
Since the discovery of BXs in 1955 (Virtanen and Hietala 1955a, b), information regarding genes encoding enzymes involved in the biosynthetic pathway of these compounds has gradually expanded. Although BXs were first identified in rye, the genetic background of their biosynthesis in this species was practically unknown until 2013, with the exception of the ScBx1-ScBx2 (La Hovary 2011), ScGlu (Nikus et al. 2003), and ScGT (Sue et al. 2011) genes. After 2013, new Bx genes as well as updated versions of previously known Bx sequences were published. Five ScBx genes (ScBx1-ScBx5) of rye inbred line L318 were isolated and sequenced. Recently, Tanwir et al. (2017) published the coding sequences of the ScBx1-ScBx6 genes from cultivar PICASSO. The present study led to the isolation and prediction of the structure and function of two further rye genes controlling BXs biosynthesis, namely ScBx6 and ScBx7, which are orthologs of ZmBx6 and ZmBx7 in maize, respectively. According to Frey et al. (2003), the orthologs of the Bx6 and Bx7 genes were lost in rye and wild barley species, which in turn caused the dominant aglucon in these species to become DIBOA, whereas in maize and wheat it became DIMBOA. However, Frey et al. did not present an alternative pathway for the synthesis of DIMBOA, which many authors have shown to be present in rye (e.g., Rice et al. 2005; Zasada et al. 2007; Meyer et al. 2009; Rakoczy-Trojanowska et al. 2017). To date, TRIBOA-Glc, an intermediate product between DIBOA-Glc and DIMBOA-Glc, has not been found in rye, which some researchers (Frey et al. 2003) saw as proof that the BXs biosynthetic pathway in rye ends at BX5. It can be hypothesised that the level of TRIBOA-Glc synthesis in rye is much lower than in maize, which makes its detection difficult.
The ScBx6 L318 and ScBx6 Lo7 genes, based on the bioinformatic prediction, contained three exons and two introns. However, based on Frey et al. (2003), and in accordance with our results of the maize genome analysis deposited in the NCBI database (GCF_000005005.2, 6.09.2018), the ZmBx6 gene has no introns. The ZmBx6 gene seemed to be unique in the Bx6 ortholog group given that the other Bx genes have at least one intron. On the basis of our comparative analysis of these sequences and the genomes of several Poaceae species, it was determined that Bx6-like genes in Ae. tauschii and S. italica had two introns, and in S. bicolor none (data not shown). The cds of ScBx6 L318 , ScBx6 PICASSO , and ZmBx6 were identical in length (1125 bp), in contrast to ScBx6 Lo7 , in which the cds was 12 bp longer.
Despite the very high similarity of ScBx6 L318 to the ScBx6-like PICASSO (99%) and ScBx6 Lo7 (98%) genes in terms of cds, significant differences between their protein models were detected. An insertion in the first exon of ScBx6 Lo7 (374 aa to 378 aa) resulted in one additional α-helix in relation to ScBx6 L318 and ScBx6-like PICASSO. However, the differences in the arrangement of four α-helices and two β-helices were most likely caused by two non-conservative substitutions (Ser L318/Arg PIC and Ala L318/Thr PIC) within the third exon of ScBx6-like PICASSO and, in the case of ScBx6 Lo7, by four and two non-conservative substitutions in the first (Met L318/Leu Lo7, Gly L318/Arg Lo7, Val L318/Thr Lo7, Val L318/Thr Lo7) and the third (Glu L318/Asp Lo7, Val L318/Leu Lo7) exon, respectively. Tanwir et al. (2017) demonstrated that the ScBx6-like PICASSO enzyme allows for the conversion of DIBOA-Glc to TRIBOA-Glc. Despite the differences in structure and chemical properties between the ScBx6-like PICASSO and ZMBX6 proteins, they must have the same function, which suggests that the transformation of DIBOA-Glc into TRIBOA-Glc takes place in both rye and maize (Tanwir et al. 2017). The high identity in cds and at the AA sequence level indicates that the ScBx6 L318 and ScBx6 Lo7 proteins have the same catalytic properties as the protein of the PICASSO cultivar. It can be assumed that the orthologs of the Bx6 and Bx7 genes that control the BXs biosynthetic pathway in maize are also present in rye.
Based on the bioinformatic prediction, ScBx7 L318 and ScBx7 Lo7 both have two exons and one intron. BLAST analysis of the ZmBx7 gene sequence (GenBank Acc. No. EU192149) against the maize genome in the NCBI database (GCF_000005005.2, 6.09.2018) showed that the ZmBx7 gene has the same structure (two exons and one intron). The gene cds in both rye lines was 1098 bp, while for ZmBx7 it was 1161 bp. Both the ScBx7 L318 and ScBx7 Lo7 genes showed 72% identity to the ZmBx7 gene at the cds level. The resemblance of the coding sequence, its length, and the same structure of the ScBx7 L318, ScBx7 Lo7, and ZmBx7 genes allowed us to assume that these genes are orthologs and have the same function. So far, the catalytic properties of BX7 are known only in maize. Both the ScBx7 L318 and ScBx7 Lo7 genes encoded a protein 26 amino acids shorter than ZMBX7 (365 vs 391), which showed a similarity of 45% at 99% coverage. The SCBX7 L318 and SCBX7 Lo7 protein models differed in the arrangement of two α- and β-helices, the source of which was six SNPs associated with non-conservative substitutions (Ala L318/Pro Lo7, Thr L318/Ala Lo7, Ser L318/Asn Lo7, Lys L318/Arg Lo7, Asn L318/Lys Lo7, Thr L318/Ser Lo7) within the first exon. Structural elements such as elongated and/or additional α-helices could be related to differences in activity and enzymatic efficiency in relation to their orthologs, which requires further research.
The phylogenetic analysis of the cds of genes encoding dioxygenases revealed that the ScBx6 L318, ScBx6 Lo7, ScBx6-like PICASSO, and TaBx6-like genes are orthologs of ZmBx6, although in the case of the TaBx6-like genes more experimental data will be necessary to determine their function.
The phylogenetic analysis of the cds of genes encoding O-methyltransferases deposited in the NCBI database and those obtained in this study revealed that the wheat gene designated as TaBx7-like was located within the most distant cluster, which also encompassed the ScBx7 L318 and ScBx7 Lo7 genes. This gene was similar to both the ScBx7 L318 and ScBx7 Lo7 genes, as well as to the other analysed genes encoding O-methyltransferases. Therefore, it can be assumed that TaBx7-like encodes an O-methyltransferase and, presumably, controls the conversion of TRIBOA-Glc to DIMBOA-Glc in wheat. The second wheat gene (GenBank Acc. No. U76384.1) included in the analysis was located on a branch separate from the rest of the genes encoding O-methyltransferases, and this branch was not as strongly supported as the others (59 vs. 99-100%). Moreover, it showed a lower level of similarity, both in cds and at the AA level, to the ScBx7 L318, ScBx7 Lo7, and ZmBx7 genes (supplementary materials, F1-F2). Therefore, this gene is most likely not involved in BXs biosynthesis in wheat. On the basis of the obtained results, it can be supposed that the newly isolated ScBx7 L318 gene codes for an O-methyltransferase that takes part in the BXs biosynthesis pathway in rye, analogously to ZmBx7 in maize. However, it will be necessary to study the enzyme's catalytic properties and its ability to catalyse the conversion of TRIBOA-Glc and TRIMBOA-Glc to DIMBOA-Glc and DIM2BOA-Glc, respectively, to confirm the hypothesis formulated above.
To predict the probable functions of the newly isolated ScBx6 L318 and ScBx7 L318 genes, we performed an in silico analysis of their upstream regulatory sequences. Currently, the analysis of 5′ cis-regulatory sequences is often used to gain information regarding the function of the investigated genes (e.g., Guo et al. 2017; Wang et al. 2017; Nawkar et al. 2017; Charfeddine et al. 2017). By applying such analysis, Guo et al. (2017) proved that the protein encoded by the BplMYB46 gene, which binds to the MYBCORE and AC-box motifs, plays an important role in secondary cell wall synthesis and response to abiotic stress in Betula platyphylla. The identification of three cis GT-1 elements in the promoter sequence of the Cucumis sativus CsSAMs gene proved its role in response to elevated salinity stress.
In the promoter sequences of the ScBx6 L318 and ScBx7 L318 genes several stress-specific motifs were identified, which clearly indicates their roles in plant defence. However, cis-acting regulatory elements associated with abscisic acid (ABA) responsiveness, defence and stress responsiveness, and MeJA-responsiveness were found only in ScBx6, whereas the MYB binding site associated with drought-inducibility was identified only in the 5′ upstream region of ScBx7. MeJA is responsible for launching systemic resistance mechanisms. After wounding or contact of the plant with a pathogen, a significant increase in the MeJA concentration is observed (Rates et al. 2009). Part of the response to pathogens triggered by MeJA is the accumulation of secondary metabolites. The expression of genes involved in secondary metabolism is increased when the primary metabolism is inhibited (Paudel et al. 2014). According to Schaller et al. (2005), MeJA leads to the degradation of chlorophyll a and b and inhibits the activity of ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO). The chlorophyll decomposition process is accompanied by reduced photosynthetic efficiency. This regulator stimulates photorespiration and peroxidase activity, and is involved in stomatal closure (Schaller et al. 2005). The motifs connected with the MeJA signalling pathway were present in the majority of ScBx promoter sequences cloned and analysed to date, with the exception of ScBx3 L318 (characterized previously by Bakera et al. 2015) and ScBx7 L318 (described in this work). ABA, another important hormone, affects tolerance to many abiotic stresses, such as the presence of heavy metals, drought, high temperature, cold, elevated salinity, or irradiation, and the amount of this regulator increases in response to stress (Vishwakarma et al. 2017). Other functions of ABA mentioned by Vishwakarma et al. (2017) are the inhibition of seed germination and an impact on stomatal movement.
In the ScBx6 L318 gene, 7 SSM associated with the ABA signalling pathway (ABRE) were found. The frequency of those SSM was 1.75- to 7.0-fold greater than in the regulatory sequences of other Bx genes from the Poaceae family. The total number of motifs differed as well: it was greatest in the ScBx6 L318 gene, exceeding the values for the other genes 1.83- to 11-fold (supplementary materials, Table G-S). Interestingly, no ABRE motifs were identified in the ScBx1 L318-ScBx5 L318 genes and in the ScBx7 L318 gene (described in this work). Zhang et al. (2017) reported the occurrence of at least one drought stress-responsive cis-element (including, e.g., MBS, an MYB binding site involved in drought-inducibility, and LTR, a low-temperature response element) in the promoter sequences of maize drought-responsive genes. The ScBx7 L318 gene had both SSM types, while in the other analysed genes only MBS (Bx3 orthologs), only LTR (ScBx6 L318, AsBx4, and AtBx4 genes), or neither motif (ZmBx genes) was found (supplementary materials, Table G-S). The presence of motifs related to the MeJA relay pathway and other important cis-acting elements in the promoter sequences of the ScBx6 L318 and ScBx7 L318 genes clearly indicates their role in the rye defence strategy. These results strengthen our conviction that there is potential in, for example, the selection of lines with high BXs content to obtain cultivars more resistant to abiotic and biotic stresses. Rakoczy-Trojanowska et al. (2017) showed that polymorphisms in ScBx gene sequences, especially the most valuable SNPs (ScBx4_1702 and ScBx5_1105) unaffected by environmental factors, were associated with pre-harvest sprouting resistance (PHS-R) and greater GDIMBOA, MBOA, and HBOA content, respectively.
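The fold ranges quoted above are ratios of motif counts between ScBx6 L318 and the other Bx genes. The sketch below shows how such a range arises. The ScBx6 L318 counts (7 ABRE motifs, 11 SSM in total) come from the text, whereas the comparison counts (4 and 1 ABRE motifs; 6 and 1 motifs in total) are back-calculated from the reported fold values and are assumptions, not values taken from the supplementary tables. Genes with no motif of a given type are left out of the ratio.

```python
def fold_range(reference_count, other_counts):
    """Smallest and largest ratio of a reference count to nonzero comparison counts."""
    ratios = [reference_count / c for c in other_counts if c > 0]
    return round(min(ratios), 2), round(max(ratios), 2)


# Comparison counts are assumed (back-calculated from the reported folds).
abre_fold = fold_range(7, [4, 1])
all_ssm_fold = fold_range(11, [6, 1])
```

With these assumed counts the function returns the 1.75 to 7.0-fold and 1.83 to 11-fold ranges given in the text.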
The obtained results extended the knowledge of the genetic background of BXs biosynthesis in rye by adding the characteristics of two further genes. We predicted their probable function and established their chromosome localization. Plants of the Poaceae family with the ability to synthesize BXs differ with regard to the number of genes controlling the BXs pathway. The lowest number is known in H. lechleri (5) and the greatest in maize (16). In three species (maize, wheat, and rye) all Bx genes are divided between four chromosomes; however, maize possesses four clusters, whereas wheat and rye have only two. Based on our results we postulate that the BXs biosynthesis pathway in wheat and rye could be analogous to that in maize, not only up to DIBOA synthesis but in subsequent steps as well.

Conclusions
1. The ScBx6 L318 and ScBx7 L318 rye genes are most probably the orthologs of the ZmBx6 and ZmBx7 genes, respectively.
2. Based on the in silico analysis of Bx6 and Bx7 orthologs in wheat and maize, it can be proposed that the DIMBOA-Glc biosynthetic pathway is similar in both species.
Author contribution statement Both authors contributed to the text. The experimental results, figures, tables, and supplementary materials were provided by BB.