Introduction

Interactions of insect pests with plant hosts elicit a myriad of molecular responses in both the host and the insect, as the plants mount a defensive or offensive response and the pests try to breach the defenses or protect themselves from detrimental chemical defenses. In response to herbivory, plants produce an array of secondary metabolites as defense molecules that are harmful to insects. These toxins adversely affect the insect growth, performance, reproduction and survival (Theis and Lerdau 2003; Leicach and Chludil 2014; Wink 2018). Over the course of evolution, phytophagous insects have developed a sophisticated detoxification system to counter and cope with these harmful toxins. This complex system is comprised of a group of biotransformation enzymes, including carboxylesterases, cytochrome P450s, glutathione S-transferases, and uridine diphosphate (UDP)-glycosyltransferases (Li et al. 2007). These enzymes: (i) prevent the damage to biological molecules within the insect midgut; (ii) excrete the toxic compounds; or (iii) directly metabolize the plant toxins (War et al. 2019). Generally, these enzymes are present at low levels, but are induced when insects ingest toxic metabolites (War et al. 2013).

UDP-glycosyltransferases (UGTs) are a superfamily of enzymes (EC 2.4.1.17) that catalyze the transfer of glycosyl residues from activated nucleotide sugars to acceptor molecules to produce glycosides. Insect UGTs use UDP-glucose as the sugar donor for glucosidation of exogenous substrates (Ahmad and Hopkins 1993; Rausell et al. 1997). UGTs are implicated in detoxification of plant secondary metabolites encountered by many insects in their diet (Després et al. 2007). In tobacco hornworm (Manduca sexta), UGTs metabolize plant phenolics (Ahmad and Hopkins 1993). BmUGT1, a UGT identified from silkworm (Bombyx mori), degrades the secondary metabolites flavins and coumarins (Luque et al. 2002). Besides their role in detoxification, UGTs also play roles in other physiological processes, including cuticle formation, pigmentation, and olfaction (Kramer and Hopkins 1987; Hopkins and Kramer 1992; Wang et al. 1999).

The Hessian fly (Mayetiola destructor [Say]), belonging to the order Diptera (Family Cecidomyiidae) is an obligate parasite of host wheat (Triticum aestivum) causing significant economic losses (Schmid et al. 2018). In addition to the host wheat, Hessian fly larvae also attack several nonhosts resulting in severe physical and metabolic consequences (Chen et al. 2009; Hargarten et al. 2017; Subramanyam et al. 2019). The life cycle of Hessian fly begins with the newly hatched larvae (neonates) that migrate to the base (crown) of the plant where the first- and second-instar larvae feed on the abaxial surface of the developing leaf sheath. The wheat-Hessian fly interaction fits the gene-for-gene model (Hatchett and Gallun 1970) resulting in either a compatible or incompatible interaction. During incompatible interactions, the H (Hessian fly resistance) gene-mediated resistance is accompanied by increase in accumulation of transcripts encoding plant defense proteins (Sardesai et al. 2005; Subramanyam et al. 2006, 2008, 2013) that disrupt the insect midgut microvilli (Shukle et al. 2010) resulting in larval death (avirulent larvae) while the resistant plant (harboring the H gene) shows normal growth (Gallun 1977). However, during compatible interactions, larvae inject salivary effector proteins (Chen et al. 2006) that alter the host plant physiology and suppress plant defense responses (Baluch et al. 2012). Susceptibility in the host plant is induced as early as 1 day after egg hatch (DAH) and is characterized by: (i) increased accumulation of susceptibility-related transcripts (Puthoff et al. 2005; Liu et al. 2013; Subramanyam et al. 2015); (ii) formation of a nutritive tissue rich in amino acids, proteins and sugars (Harris et al. 2006; Saltzmann et al. 2008; Subramanyam et al. 2015, 2018); and (iii) increased cell wall permeability that facilitates diffusion of nutrients to the feeding sites (Williams et al. 2011; Nemacheck et al. 2019). 
Successful establishment of virulence allows the larvae (virulent) to develop and complete their life cycle while the susceptible (lacking H gene) host wheat is stunted (Byers and Gallun 1972).

While a few previous studies document the identification and role of cytochrome P450s (Mittapalli et al. 2005) and GSTs (Yoshiyama and Shukle 2004; Mittapalli et al. 2007) in detoxification of the wheat allelochemicals by Hessian fly, to the best of our knowledge there is no report that identifies the involvement of UGTs in the M. destructor detoxification system. The focus of this study was to identify and profile the expression of UGT genes from the M. destructor genome. Using the recently assembled Hessian fly genome (https://www.ncbi.nlm.nih.gov/search/all/?term=Mayetiola%20destructor, Zhao et al. 2015), genes encoding putative UGT members were identified. Expression profiling of the UGTs revealed differential expression patterns in virulent Hessian fly. Of these, MdesUGT1, a novel UDP-glycosyltransferase, showed significant up-regulation of several hundred-fold during compatible interactions (susceptible plant), in contrast to incompatible interactions, which lacked significant expression. A similar profile was also observed in Hessian fly feeding on nonhost Brachypodium distachyon. Our results suggest a potential role of MdesUGT1 during Hessian fly virulence, possibly by detoxification of plant secondary metabolites during feeding, thereby allowing unhindered growth and development of virulent larvae on susceptible plants.

Materials and methods

Insect and plant material

The Hessian fly (Mayetiola destructor) Biotype L and Great Plains (GP) laboratory stocks were maintained in diapause in a 4 °C cold room as described by Sosa and Gallun (1973) at the USDA-ARS Crop Production and Pest Control Research Unit in West Lafayette, IN. The wheat (Triticum aestivum) lines “Molly” (harboring H13 resistance gene) and “Newton” (lacking H resistance gene) were used in this study. Biotype L feeding on Molly wheat yields an incompatible interaction (resistant wheat, avirulent larvae). Biotype L and GP feeding on Newton wheat yield a compatible interaction (susceptible wheat, virulent larvae). Brachypodium distachyon (Bd) seeds of line Bd21 were a gift from Roger Thilmony (Albany, USDA-ARS) and yield nonhost resistance to Biotype L infestation.

Plant growth and infestation

Four-inch pots containing Promix Professional growing medium (Premier Horticulture Inc., Quakertown, PA, US) were seeded (12 seeds per pot) with Newton or Molly wheat lines and placed in a Conviron growth chamber (Controlled Environments Limited, Winnipeg, Manitoba, Canada) set at 18 °C with a 16 h/8 h (light/dark) photoperiod (irradiance between 980 and 1470 μmol m−2 s−1) and 60% relative humidity. When the plants reached the two-leaf stage, pots were covered with vented plastic cups. To infest the plants, 3 female and 2 male Hessian flies were introduced into each pot, resulting in infestation levels of 18 larvae per plant, on average. For Bd plants, 10 seeds were planted in each 4-inch pot containing equal volumes (50:50 mix) of vermiculite (Perlite Vermiculite Packaging Industries, North Bloomfield, OH) and Fafard professional potting mix (Conrad Fafard Inc., Agawam, MA) in a Conviron growth chamber set at 18 °C with a 24 h photoperiod and 60% relative humidity. When the plants reached the 3-leaf stage, they were infested with 10 female and 2 male Biotype L flies per pot as previously described (Hargarten et al. 2017).

Collection of insect tissue

For transcript profiling studies, neonate control larvae (Biotype L and GP) that had not fed on wheat plant tissue were collected from 30 infested Newton plants using the method described in Subramanyam et al. (2018). First- (1 and 3 Days After Egg Hatch; DAH) and second-instar Biotype L (9, 14, 21 DAH) and GP (9 DAH) virulent larvae that had fed on Newton wheat plants, and first-instar (1 and 3 DAH) Biotype L avirulent larvae that fed on Molly wheat plants, were collected in three biologically replicated experiments from 30 plants per treatment and time-point. Biotype L larval collections feeding on Bd plants included the neonates, first- (3 DAH) and second-instars (9 DAH) in three biologically replicated experiments. The crowns (feeding site) from infested wheat and Bd seedlings were dissected in deionized water under the microscope to dislodge the larvae, which were subsequently transferred into a 1.5 ml microfuge tube. The carried-over water in the tubes was gently pipetted out and the larvae were flash-frozen in liquid nitrogen. The samples were stored at −80 °C until further use.

Identification of M. destructor UGT members

Protein sequences for genes encoding UGT in reference insect genomes were obtained from FlyBase for Drosophila melanogaster (www.flybase.org; McQuilton et al. 2012) and Vector Base for Anopheles gambiae and Aedes aegypti (www.vectorbase.org; Lawson et al. 2009). These sequences (Supplementary Table S1) were used as queries in the BLASTp program (https://blast.ncbi.nlm.nih.gov; Altschul et al. 1990) searches to extract orthologs from the M. destructor (GP biotype) genome assembly (Zhao et al. 2015) as described in Shreve et al. (2013). Briefly, the top hits of the query sequences (D. melanogaster, An. gambiae, or Ae. aegypti) from the M. destructor genome assembly were assigned unique MdesUGT numbers. It was possible for multiple sequences from D. melanogaster, An. gambiae, or Ae. aegypti to extract the same MdesUGT gene (Supplementary Table S1). All MdesUGT sequences identified in the Hessian fly genome assembly were further assessed by JBrowse, a genome viewer that contains gene annotations automatically generated by gene prediction software. The coding sequences for all the candidate genes encoding UGT protein were extracted along with the details of chromosome/scaffold location. To confirm that the extracted sequences encode for UGT enzymes, the putative UGT sequences were analyzed by Conserved Domain (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) and Interpro (https://www.ebi.ac.uk/interpro/) domain analysis programs. The sequences were also subjected to BLASTp searches against the National Center for Biotechnology Information (NCBI) nr database to identify the top blast hit and E-value. The UGT family/subfamily noted for the top hit in the BLASTp results was assigned to the queried UGT sequence.
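
The top-hit bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the pipeline used in the study: the function name `assign_mdes_ugt_numbers`, the gene identifiers, and the bit scores are all hypothetical, and the input is assumed to be (query, subject, bit score) triples already parsed from tabular BLASTp output.

```python
def assign_mdes_ugt_numbers(hits):
    """Assign one MdesUGT label to each distinct top-hit gene.

    hits: iterable of (query_id, subject_id, bit_score) tuples
          (hypothetical layout, e.g. parsed from tabular BLASTp output).
    Returns (names, query_to_name):
      names         maps subject_id -> "MdesUGTn"
      query_to_name maps query_id   -> "MdesUGTn"
    """
    # Keep only the best-scoring subject for every query sequence.
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)

    # Number subjects in order of first appearance. Several queries may
    # recover the same Hessian fly gene, in which case they share a label.
    names = {}
    for subject, _ in best.values():
        if subject not in names:
            names[subject] = f"MdesUGT{len(names) + 1}"

    query_to_name = {q: names[s] for q, (s, _) in best.items()}
    return names, query_to_name


# Hypothetical example hits (identifiers and scores are made up).
example_hits = [
    ("Dmel_UGT_a", "Mdes_gene_A", 310.0),
    ("Dmel_UGT_a", "Mdes_gene_B", 150.0),
    ("Agam_UGT_b", "Mdes_gene_A", 295.0),
    ("Aaeg_UGT_c", "Mdes_gene_C", 410.0),
]
names, query_to_name = assign_mdes_ugt_numbers(example_hits)
print(names)
print(query_to_name)
```

In this toy input, two reference queries recover the same gene and therefore map to the same label, which mirrors the many-to-one relationship noted for Supplementary Table S1.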

Quantification of UGTs identified from M. destructor genome

RNA was isolated from first- (1 and 3 DAH) and second-instar (9 DAH) GP larvae feeding on Newton wheat with TRIzol reagent (Invitrogen, Waltham, MA) according to the manufacturer’s protocol. Reverse transcription to generate cDNA for use in transcript quantification was conducted using random hexamers as described in Subramanyam et al. (2015). Transcript profiling of genes encoding putative UGTs was carried out by quantitative real-time reverse transcription PCR (qRT-PCR). Target-specific primers (Supplementary Table S2) were designed with Primer Express 3.0 Software from Applied Biosystems (ABI, Foster City, CA) using the JBrowse predicted coding sequences encoding putative UGTs, generated from the GP biotype genome assembly. The qRT-PCR was performed on a LightCycler 480 Instrument II (Roche Diagnostics Corporation, Indianapolis, IN). Reaction volume of 10 µl contained 5 µl of 2X LightCycler 480 SYBR Green I Master (Roche), forward and reverse gene-specific primers at a final concentration of 0.5 µM each, and 20 ng cDNA template. The PCR cycling parameters were 45 cycles of 95 °C for 10 s, 60 °C for 10 s, and 72 °C for 10 s. Amplification of a single product was confirmed through melt-curve analysis (95 °C for 5 s, 65 °C for 1 min, 97 °C continuous) following PCR. All qRT-PCRs were carried out in triplicate for each of the three biological replicates. Hessian fly 18S rRNA (NCBI accession number KC177284) was included in the qRT-PCR as the endogenous control (Subramanyam et al. 2018). No-template negative controls were included in each PCR plate. Quantification of transcript abundance was done using Relative Standard Curve method (ABI User Bulletin 2, ABI PRISM 7700 Sequence Detection System). Statistically significant differences in the relative expression values were determined using one-sided Tukey pairwise comparison (JMP Pro Ver. 14, SAS Institute Inc.). Differences were considered statistically significant at p < 0.05. 
Fold change was calculated as the ratio of transcript levels in virulent Hessian fly larvae that have started feeding at 1, 3 and 9 DAH to the neonates that are assumed to be nonfeeding.
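
For readers unfamiliar with the relative standard curve approach, the calculation can be sketched as follows. This is a minimal illustration, not the analysis code used in the study. All Ct values and the dilution series are hypothetical, the function names are ours, and a single standard curve is reused for the target and the 18S reference purely for brevity; in practice each primer pair would have its own curve.

```python
import math


def fit_standard_curve(log10_quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Interpolate a relative quantity from a Ct value on the curve."""
    return 10 ** ((ct - intercept) / slope)


def fold_change(target_sample, ref_sample, target_neonate, ref_neonate):
    """Reference-normalized target quantity in a feeding stage,
    divided by the same normalized quantity in neonates."""
    return (target_sample / ref_sample) / (target_neonate / ref_neonate)


# Hypothetical 10-fold dilution series (log10 relative quantity -> Ct).
slope, intercept = fit_standard_curve([1.0, 2.0, 3.0, 4.0],
                                      [30.0, 27.0, 24.0, 21.0])

# Hypothetical Ct values for the target gene and the 18S reference.
target_larva = quantity_from_ct(24.0, slope, intercept)
target_neonate = quantity_from_ct(30.0, slope, intercept)
ref_larva = quantity_from_ct(21.0, slope, intercept)
ref_neonate = quantity_from_ct(21.0, slope, intercept)

fc = fold_change(target_larva, ref_larva, target_neonate, ref_neonate)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, fold change={fc:.1f}")
```

Normalizing to the endogenous control before taking the ratio to neonates keeps differences in cDNA input between samples from being read as expression changes.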

Quantification of MdesUGT1 during compatible and incompatible interactions

MdesUGT1 transcripts were quantified and analyzed as described above in (i) Biotype L larvae feeding on Molly resistant wheat (incompatible interaction) 1 and 3 DAH; (ii) Biotype L larvae feeding on susceptible Newton wheat (compatible interaction) 1, 3, 9, 14 and 21 DAH and (iii) Biotype L larvae feeding on Bd plants (nonhost resistance) at 1, 3 and 9 DAH using the qRT-PCR primers designed (Supplementary Table S2) from the transcript sequence of Mdes008221 (MdesUGT1) extracted from GP genome assembly. 18S rRNA gene was used as the endogenous control (Supplementary Table S2).

Isolation of cDNA and genomic clones of MdesUGT1

Since the JBrowse gene prediction for Mdes008221 (MdesUGT1) was incomplete, 5′ and 3′ RACE PCRs were performed to obtain the near full-length cDNA clone for MdesUGT1 gene from M. destructor genome. The 5′- (5′-GATGGGCCAAAACGTCCGATTGTGGCAGC-3′) and 3′- (5′-CCGGGCCTGATCAATGTTGGCGGAGCAC-3′) RACE primers were designed from the partial JBrowse predicted Mdes008221 (MdesUGT1) cDNA sequence. The SMARTer RACE 5′/3′ kit (Takara, Mountain View, CA) was used with Advantage 2 polymerase (Takara) to amplify the cDNA ends from virulent Biotype L (collected from Newton wheat 4 DAH) RACE-ready cDNA according to the manufacturer’s protocol. PCR products were gel-purified with the NucleoSpin Gel and PCR Clean-Up kit (Takara) and cloned into the pCR4-TOPO TA vector using the TOPO TA Cloning Kit for Sequencing (Invitrogen, Waltham, MA). The cloned RACE products were sequenced by GENEWIZ (South Plainfield, NJ). Based on the RACE sequencing results, forward (5′-TTGGGTCAACGAGACGTGCAT-3′) and reverse (5′- GCAATTGTTTGCCATTCAGTTGGA-3′) primers were designed to amplify the coding and genomic sequences using cDNA and genomic DNA isolated from virulent Biotype L larvae (9 DAH) collected from Newton wheat as the template, respectively. The coding sequence of MdesUGT1 from virulent GP larvae (9 DAH) was also amplified with the above primers. A 25 µl reaction mixture was set up containing 1X PCR buffer, 2 mM MgSO4, 0.2 mM of each dNTP, 0.2 µM of each primer, 50 ng cDNA template, and 1 unit of Platinum Taq DNA Polymerase High Fidelity (Invitrogen). The PCR cycling parameters were as follows: 94 °C for 2 min; 35 cycles of 94 °C for 30 s, 52 °C for 30 s, 68 °C for 2 min (cDNA) or 4 min (genomic DNA); final extension of 68 °C for 5 min. Each amplicon was gel-purified, cloned and sequenced as described above. The coding and genomic sequences obtained were submitted to NCBI.

Characterization of MdesUGT1

Annotations and sequence similarity analyses of the MdesUGT1 sequence were done using BLAST programs available on the NCBI website (http://www.ncbi.nlm.nih.gov/). Predictions for isoelectric point (pI) and molecular mass were based on the entire ORF using the pI/MW tool at Expasy (www.expasy.org/tools/pi_tool.html). Conserved domain search was done using the BLASTp tools on the NCBI website and Pfam database (https://pfam.xfam.org). SignalP v3.0 (Center for Biological Sequence Analysis, Technical University of Denmark; http://www.cbs.dtu.dk/services/SignalP/) was used to predict signal peptides. The 3D structure of MdesUGT1 was modelled using Phyre2 (Protein Homology/Analogy Recognition Engine V2.0; Kelley et al. 2015).

Phylogenetic analysis of MdesUGT1

Phylogenetic analysis was undertaken to explore the evolutionary relationship of MdesUGT1 relative to orthologous genes encoding UGT protein from other species. The deduced protein sequence of MdesUGT1 identified from Biotype L (M. destructor) was used to query the NCBI nr protein database. The BLASTp search (Altschul et al. 1990) was carried out against Bacteria (taxid: 2), Plants (taxid: 3193), and Insects (taxid: 6960). The top 10 unique hits from each category were included in the phylogenetic analysis (Supplementary Table S3). All sequences were aligned using MUSCLE (Multiple Sequence Comparison by Log-Expectation) version 3.8.31 (Edgar 2004), and maximum-likelihood phylogeny estimated using PhyML 3.0 based on an Approximate Likelihood-Ratio Test (Anisimova and Gascuel 2006). The phylogenetic tree was drawn using the tree viewer, TreeDyn 198.3 (Chevenet et al. 2006).
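
The selection of the top 10 unique hits per taxonomic category can be sketched as below. This is an illustration only: it assumes that "unique" means a distinct accession, that the hits are already ordered from best to worst, and that each hit is an (accession, description) pair. The accessions, descriptions, and the function name `top_unique_hits` are hypothetical.

```python
def top_unique_hits(ranked_hits, n=10):
    """Return the first n hits with distinct accessions, in rank order.

    ranked_hits: list of (accession, description) tuples, best hit first.
    """
    seen = set()
    selected = []
    for accession, description in ranked_hits:
        if accession in seen:
            continue  # skip repeated accessions
        seen.add(accession)
        selected.append((accession, description))
        if len(selected) == n:
            break
    return selected


# Hypothetical ranked hits per taxonomic category (accessions are made up).
hits_by_category = {
    "Bacteria": [("ACC_B1", "glycosyltransferase"),
                 ("ACC_B1", "glycosyltransferase"),
                 ("ACC_B2", "UDP-glucosyltransferase")],
    "Plants": [("ACC_P1", "UDP-glycosyltransferase"),
               ("ACC_P2", "UDP-glycosyltransferase")],
    "Insects": [("ACC_I1", "UDP-glycosyltransferase"),
                ("ACC_I2", "UDP-glycosyltransferase"),
                ("ACC_I1", "UDP-glycosyltransferase")],
}

# n=2 is used here only to keep the toy example short; the study used 10.
panel = {category: top_unique_hits(hits, n=2)
         for category, hits in hits_by_category.items()}
for category, selected in panel.items():
    print(category, [accession for accession, _ in selected])
```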

Results

Identification of M. destructor UGTs

Using the amino acid sequences from D. melanogaster, An. gambiae, and Ae. aegypti UGTs as queries, we identified a total of 13 putative UGT sequences from the M. destructor (GP biotype) assembled genome. Table 1 provides the IDs for the putative UGTs identified from the Hessian fly genome, which have been designated MdesUGT1 to MdesUGT13. Three of the UGTs (MdesUGT2, MdesUGT8, MdesUGT11) were located on chromosome A1, two (MdesUGT6, MdesUGT10) on A2 and one each on chromosomes X1 (MdesUGT1) and X2 (MdesUGT7). Six UGT genes (MdesUGT3, MdesUGT4, MdesUGT5, MdesUGT9, MdesUGT12, MdesUGT13) were unmapped and on scaffolds (Table 1). BLAST searches of JBrowse predicted protein sequences for the Hessian fly UGT genes against the NCBI nr database revealed that, except for MdesUGT10, for which the top BLAST hit was a UGT identified from Anopheles darlingi, the UGTs from M. destructor showed high sequence similarity with UGTs from Contarinia nasturtii (swede midge). InterPro and Pfam domain searches with the JBrowse predicted protein sequences revealed that MdesUGT genes are members of the UDP-glycosyltransferase family. The NCBI conserved domain database also clearly showed that all sequences of MdesUGT genes contained the glycosyltransferase domain. Except for the MdesUGT10 sequence, for which no specific family/subfamily was assigned, all other MdesUGT genes were assigned to a specific UGT family and subfamily based on the description of the top hit (Table 1).

Table 1 UGT genes from M. destructor genome

Expression profile of M. destructor UGT genes in virulent GP larvae

To determine the mRNA expression profile of the 13 putative UGT genes (MdesUGT1 to MdesUGT13) identified from the Hessian fly genome, qRT-PCR was employed using first- (1 and 3 DAH) and second-instar (9 DAH) GP larvae feeding on Newton (susceptible) wheat. Except for MdesUGT12, for which no transcripts were detected at the 1 and 3 DAH time points, all other UGT genes identified were differentially expressed during Hessian fly larval developmental stages (Fig. 1). All UGT genes had basal levels of expression in the neonates (Supplementary Fig. S1). Expression of four genes, MdesUGT1, MdesUGT4, MdesUGT6 and MdesUGT10, showed a significant increase in both the first- and second-instars as compared to neonates (Fig. 1). While three of the four UGT genes (MdesUGT4, MdesUGT6 and MdesUGT10) showed a small increase in transcripts ranging from 2.1- to 3.6-fold, MdesUGT1 increased dramatically (Fig. 1). At 1 DAH, MdesUGT1 expression increased threefold (p = 0.0049). However, by 3 and 9 DAH the transcripts accumulated dramatically to significantly higher levels of 17.1-fold (p = 0.0012) and 360.7-fold (p < 0.0001), respectively, as compared to neonates. Five genes, MdesUGT2, MdesUGT5, MdesUGT7, MdesUGT11 and MdesUGT13, were significantly down-regulated in both the first- and second-instars. For these genes, the transcript levels decreased significantly in the second-instars (9 DAH) as compared to first-instar larvae, by 39.9-fold (p < 0.0001), 22.4-fold (p < 0.0001), 20.4-fold (p < 0.0001), 18.9-fold (p < 0.0001), and 2.8-fold (p < 0.0001) for MdesUGT11, MdesUGT13, MdesUGT5, MdesUGT7 and MdesUGT2, respectively (Fig. 1). Three of the UGTs (MdesUGT3, MdesUGT8 and MdesUGT9) exhibited variable expression patterns. MdesUGT8 and MdesUGT9 were not significantly differentially expressed at 1 DAH, but the former showed a significant decrease in transcript levels of 6.2-fold (p < 0.0001) at 3 DAH and 1.9-fold (p = 0.0045) at 9 DAH. 
MdesUGT3 transcripts were significantly down-regulated at 1 DAH (1.3-fold; p = 0.0120), unchanged at 3 DAH, and significantly up-regulated by 9 DAH (2.3-fold; p = 0.0007).

Fig. 1

Differential expression of UGT genes in Hessian fly. Heatmap depicts expression profiles of 13 genes identified from M. destructor genome encoding UGT enzymes in first- (1 and 3 DAH, days after egg hatch) and second-instar (9 DAH) developmental stages in GP larvae feeding on susceptible Newton wheat. Log2 fold change values, as determined by qRT-PCR, are shown within each cell of the heatmap. Blue represents up-regulated genes, red represents down-regulated genes, and white represents genes that are not differentially regulated as compared to the neonates. *indicates genes that are not significantly differentially expressed at those respective time points as compared to the neonates
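
Because the heatmap reports log2 fold change, it can help to see how the fold-change values quoted in the text map onto that scale. The sketch below is illustrative only: the function name is ours, and the sign convention (positive for up-regulation, negative for down-regulation relative to neonates) is an assumption about how the heatmap values were derived rather than a documented detail of the study.

```python
import math


def signed_log2_fold_change(fold, direction):
    """Convert an "n-fold up" or "n-fold down" value to log2 scale.

    fold:      magnitude of the change (> 0), e.g. 360.7
    direction: "up" or "down" relative to the neonate baseline
    """
    if fold <= 0:
        raise ValueError("fold must be positive")
    if direction == "up":
        return math.log2(fold)
    if direction == "down":
        return -math.log2(fold)
    raise ValueError("direction must be 'up' or 'down'")


# Fold changes quoted in the text for GP larvae relative to neonates.
reported = [
    ("MdesUGT1", "1 DAH", 3.0, "up"),
    ("MdesUGT1", "3 DAH", 17.1, "up"),
    ("MdesUGT1", "9 DAH", 360.7, "up"),
    ("MdesUGT3", "1 DAH", 1.3, "down"),
    ("MdesUGT3", "9 DAH", 2.3, "up"),
]
for gene, stage, fold, direction in reported:
    value = signed_log2_fold_change(fold, direction)
    print(f"{gene} {stage}: {value:+.2f}")
```

On this scale a doubling corresponds to +1 and a halving to −1, so up- and down-regulation of equal magnitude sit symmetrically around zero.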

Expression of MdesUGT1 during compatible and incompatible host wheat-Hessian fly interactions

The mRNA expression patterns of MdesUGT1 were also profiled by qRT-PCR in Biotype L larvae feeding on resistant Molly wheat, harboring the H13 resistance gene (incompatible interaction), and on susceptible Newton wheat, lacking a resistance gene (compatible interaction), at various larval developmental stages (first-, second-, and third-instars). Neonates had basal levels of MdesUGT1 expression (Supplementary Fig. S1). During incompatible interactions, there was no significant differential expression of MdesUGT1 in the first-instars (1 and 3 DAH) as compared to the neonates (Fig. 2a). However, during compatible interactions, there was a significant increase in expression of MdesUGT1 in Biotype L larvae (Fig. 2b) as compared to the neonates, similar to the expression profile observed in GP larvae (Fig. 1). The transcripts of MdesUGT1 increased by 71.1-fold (p < 0.0001) and 317.8-fold (p < 0.0001) in the first-instars at 1 and 3 DAH, respectively (Fig. 2b). Resembling the expression profile observed for GP larvae (Fig. 1), peak expression of MdesUGT1 in Biotype L larvae was observed in the second-instars by 9 DAH, reaching 926-fold (p = 0.0015) and decreasing to 283.3-fold (p < 0.0001) by 14 DAH (Fig. 2b) compared to the neonates. A relatively modest increase in expression (18.8-fold; p < 0.0001) was also observed by 21 DAH in the virulent larvae (Fig. 2b).

Fig. 2

Temporal expression of MdesUGT1. a Avirulent Biotype L Hessian fly larvae feeding on Molly (resistant) host wheat 1 and 3 DAH (days after egg hatch); b Virulent Biotype L Hessian fly larvae feeding on Newton (susceptible) host wheat 1, 3, 9, 14 and 21 DAH; c Biotype L Hessian fly larvae feeding on nonhost Brachypodium distachyon at 3 and 9 DAH. Fold change values, as determined by qRT-PCR, are presented. Statistically significant (p < 0.05) differences are indicated by “*”. Error bars represent data from three biological replicates (n = 3; each with three technical replicates)

Expression of MdesUGT1 in Hessian fly larvae feeding on nonhost B. distachyon

The expression profile of MdesUGT1 was also determined in Biotype L larvae feeding on nonhost Bd plants (Fig. 2c). The expression patterns resembled those observed in virulent Biotype L and GP larvae feeding on susceptible host wheat (Fig. 2c). MdesUGT1 transcripts accumulated to 72.6-fold (p < 0.001) in the first-instars (3 DAH), and further dramatically increased to 471.5-fold (p = 0.0149) in the second-instars (9 DAH), as compared to the neonates (Fig. 2c).

Isolation and characterization of MdesUGT1 from Hessian fly

Since the qRT-PCR expression profile of MdesUGT1, unlike the other UGT genes, revealed significantly elevated levels of accumulation of mRNA transcripts in both Biotype L and GP larvae during compatible interactions (susceptible wheat, virulent larvae) at various developmental stages, we proceeded to further clone and characterize MdesUGT1. The near full-length cDNA clone of MdesUGT1 (GenBank accession number MT985983, https://www.ncbi.nlm.nih.gov/search/all/?term=MT985983) isolated from Biotype L larvae was 1663 bp long, including a 5′ untranslated region (UTR) of 44 bp, an open reading frame (ORF) of 1542 bp, and a 3′ UTR of 77 bp (Fig. 3a). The ORF encoded a predicted polypeptide of 513 amino acids with a predicted molecular mass of 59.26 kDa and isoelectric point of 8.88. A predicted poly(A) tail was observed downstream of the gene at the end of the 3′ UTR at position 1639 bp. The genomic DNA sequence of MdesUGT1 (GenBank accession number MT965693, https://www.ncbi.nlm.nih.gov/nuccore/MT965693) revealed the presence of 6 introns at positions 4 bp, 591 bp, 886 bp, 1182 bp, 1713 bp, and 1868 bp of the ORF, bringing the sequence length to 2338 bp (Fig. 3a). The near full-length cDNA clone of MdesUGT1 isolated from GP (GenBank accession number MT997184, https://www.ncbi.nlm.nih.gov/search/all/?term=MT997184) Hessian fly larvae is 1504 nucleotides long, including a 5′ and 3′ UTR of 26 and 14 nucleotides, respectively, and an ORF of 1464 nucleotides. The ORF encoded a predicted protein of 487 amino acids with a molecular mass of 56.19 kDa and isoelectric point of 8.37. Alignment of the nucleotide and amino acid sequences for MdesUGT1 homologs from Biotype L and GP larvae revealed 98.2% and 98.9% identity, respectively. MdesUGT1 from Biotype L and GP shares 76.7% and 72.3% amino acid sequence identity, respectively, with a putative UDP-glycosyltransferase 1–7-like protein from the dipteran insect pest Contarinia nasturtii. 
The sequence of MdesUGT1 cloned from Biotype L larvae was submitted for naming to the UGT nomenclature committee (https://prime.vetmed.wsu.edu/resources/udp-glucuronsyltransferase-homepage) and was designated as UGT301F1.
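
The clone lengths reported above are internally consistent, as a short arithmetic check shows. The sketch below is ours, not part of the study; it assumes that each reported ORF length includes the stop codon, which is the reading under which the reported protein lengths are reproduced.

```python
def cdna_length(utr5_bp, orf_bp, utr3_bp):
    """Total cDNA length as 5' UTR + ORF + 3' UTR."""
    return utr5_bp + orf_bp + utr3_bp


def protein_length(orf_bp, includes_stop=True):
    """Number of encoded amino acids for an ORF of orf_bp nucleotides.

    Assumes the ORF length counts the stop codon unless told otherwise.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# Values reported in the text for the two clones.
clones = {
    "Biotype L": {"utr5": 44, "orf": 1542, "utr3": 77},
    "GP": {"utr5": 26, "orf": 1464, "utr3": 14},
}
for name, c in clones.items():
    total = cdna_length(c["utr5"], c["orf"], c["utr3"])
    residues = protein_length(c["orf"])
    print(f"{name}: cDNA {total} bp, protein {residues} aa")
```

For Biotype L this gives 1663 bp and 513 amino acids, and for GP 1504 nucleotides and 487 amino acids, matching the reported values.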

Fig. 3

MdesUGT1 sequence characterization. a Schematic representation of coding (cDNA) and genomic (gDNA) sequence of MdesUGT1. The introns are represented by rectangular, gray, numbered boxes, and exons are marked between. Start (ATG) and stop (TGA) codons, and beginning and ends of introns, are marked with nucleotide numbers. b Protein sequence of MdesUGT1 showing three conserved motifs of UGT proteins. Amino acids highlighted in gray form the UGT signature motif, while conserved sugar donor-binding regions (DBR) are shown with double underlined amino acids for DBR1 (dashed lines) and DBR2 (solid lines). Single underlined (thin black line) and italicized amino acids represent the putative signal peptide with the cleavage site marked by a filled triangle. Gray underlined (thick line) amino acids represent 4 transmembrane helices. c Secondary structure of MdesUGT1 with the N-terminus (N) and C-terminus (C) marked. Rossmann folds in the N-terminus (white font β strand-α helix-β strand) and C-terminus (black font β strand-α helix-β strand) are marked

Analysis of MdesUGT1 sequence with Pfam Domain program predicted a conserved UDP-glycosyltransferase domain between amino acid positions 20 to 512. SignalP program predicted a putative N-terminal secretory signal peptide sequence of 18 amino acid residues (Fig. 3b). MdesUGT1 showed the presence of a UGT signature motif that was 29 amino acids long with the consensus sequence FISHGGMSGTYEGVARGVPFLFSPLFADQ in the C-terminal region of the protein (Fig. 3b). MdesUGT1 sequence also contained two conserved sugar donor-binding regions, DBR1 and DBR2 (Fig. 3b). Additionally, four transmembrane helices were predicted in MdesUGT1 that are located at amino acid positions 8–25, 132–159, 198–213, and 483–510 (Fig. 3b).

Phyre2 was used to predict the secondary structure of MdesUGT1 (Fig. 3c). The three-dimensional structure of MdesUGT1 was modeled using the template structure of D2VCHA1, a protein belonging to the UDP-glycosyltransferase/glycogen phosphorylase superfamily, with which 16% identity is shared. Of the MdesUGT1 sequence, 82% (421 amino acid residues) was modeled with 100% confidence using the reference template. The three-dimensional structure also contained two Rossmann fold-like domains, composed of alternating motifs of beta strand-alpha helix-beta strand (βαβ-fold), in the N-terminal and C-terminal regions of MdesUGT1 (Fig. 3c).

Relationship of MdesUGT1 to orthologous insect UGTs

To understand the phylogenetic relationship of the Hessian fly MdesUGT1, a maximum-likelihood phylogenetic tree was constructed with UGT orthologs identified from other insect, plant and bacterial species (Supplementary Table S3). The phylogenetic tree revealed clustering of UGT sequences from these groups into three distinct clades with bootstrap support (Fig. 4). Clades I, II and III are composed of UGTs from bacteria, insects and plants, respectively. MdesUGT1 clustered in clade II and grouped along with UGTs from other insects. Within insects, MdesUGT1 grouped in a subclade along with a putative UGT from Contarinia nasturtii (swede midge), also belonging to the order Diptera. While MdesUGT1 shared 73% sequence identity with the swede midge UGT, it showed only around 48% identity with other insect UGTs that grouped into other subclades. The sequence of MdesUGT1 was very distant from UGTs in the clades composed of bacterial (clade I) and plant (clade III) UGTs, sharing only 25–45% sequence identity.

Fig. 4

Phylogenetic analysis of MdesUGT1. A maximum likelihood tree showing the relationship of MdesUGT1 (yellow box) with UGT orthologs derived from other plants, insects and bacteria. The NCBI accession numbers for these sequences are provided in Supplementary Table S3. Bootstrap values (1000 replicates) greater than 50% are shown next to the branches. Scale bar represents branch lengths

Discussion

In the current study, taking advantage of the recently assembled M. destructor (GP biotype) genome, we identified a total of 13 genes (MdesUGT1 to MdesUGT13) that encode UDP-glycosyltransferase enzymes. Consistent with UGTs documented from other organisms, the Hessian fly genome-predicted UGTs contain the glycosyltransferase domain of the UGT family, and are, hence, putative UGTs. Recent whole genome sequencing has revealed a diverse number of UGT genes in several dipteran and lepidopteran insect species. The number of putative UGT genes (13) in the M. destructor genome is comparable with that in An. gambiae (12; Huang et al. 2008) and Spodoptera littoralis (11; Bozzolan et al. 2014), but much smaller than that in D. melanogaster (33; Luque and O’Reilly 2002) and Bombyx mori (42; Huang et al. 2008). In general, the number of insect UGTs is much smaller than that reported in plants and animals, where numbers reach as high as 120 members (Li et al. 2001).

Expression analysis of the 13 UGT genes revealed varying patterns of expression during first- and second-instar larval developmental stages in virulent GP biotype larvae feeding on susceptible Newton wheat. While 4 UGT genes were significantly up-regulated, 5 genes were significantly down-regulated in both first- and second-instars. Three of the genes showed variable expression patterns in first- and second-instar larvae while one was not expressed. Of the four genes that showed an increase in transcripts, MdesUGT1 exhibited dramatically high levels of transcripts, increasing as much as 360-fold (p < 0.0001) by the 9 DAH time point as compared to the neonates, thus suggesting a possible role of MdesUGT1 during compatible interactions. While it is understood that increased transcript levels are not necessarily indicative of increased protein levels, it is likely that such high levels of MdesUGT1 transcripts would be translated into at least moderately increased protein levels as well. Future studies with antibodies raised against the protein could demonstrate the actual increase in MdesUGT1 levels in the larvae. Further analysis of MdesUGT1 expression in Biotype L larvae, one of the most virulent Hessian fly biotypes (Shukle et al. 2016), feeding on susceptible Newton wheat also revealed a similar dramatic increase in transcripts in the feeding stages (first- and second-instars) as observed for the GP Hessian fly biotype. The transcripts increased by 71- and 317-fold by the 1 and 3 DAH time points, and increased dramatically to > 920-fold by the 9 DAH time point, as compared to the neonates. In contrast, MdesUGT1 was not significantly differentially expressed during the incompatible interactions where the larvae are avirulent and die within a few days of feeding on the resistant host wheat. 
This dramatic, several hundred-fold increase in expression in the feeding instars of virulent larvae suggests that MdesUGT1 may be involved in detoxifying plant allelochemicals, in addition to potential roles in other physiological processes. This defense against plant toxins possibly allows the virulent larvae to grow and complete their development while feeding on the susceptible host plant, unlike the avirulent larvae feeding on resistant host wheat, which lack an increase in MdesUGT1 expression. In susceptible wheat infested with Hessian fly biotype L, expression levels of phenylalanine ammonia-lyase (PAL), a key enzyme in the biosynthesis of lignin, phenylpropanoids and phytoalexins (Mauch-Mani and Slusarenko 1996), increase by around threefold by 1 DAH compared to uninfested wheat (Sardesai et al. 2005). In contrast to the virulent larvae, the avirulent larvae lack increased MdesUGT1 expression, which suggests that they are unable to survive the wheat toxins (Khajuria et al. 2013; Subramanyam et al. 2019) released by the resistant host and hence die within a few days after initiating the attack. In addition to allelochemical detoxification, MdesUGT1 may also have roles in cuticular development, as suggested by its persistent and increased expression in 9 DAH virulent larvae. By this time, susceptible wheat plants show increased levels of tyrosine, not seen at early stages, in the nutritive tissue cells that the larvae feed on (Saltzmann et al. 2008). Tyrosine metabolism leading to the production of diphenolic compounds plays a pivotal role in the synthesis of structural proteins and tanning precursors for the development of new cuticle (Kramer and Hopkins 1987), followed by sclerotization (Hopkins and Kramer 1992) for pupation. As in host wheat, a similar expression profile of MdesUGT1 was observed in Biotype L larvae feeding on nonhost Bd plants, where expression levels increased by as much as 72- and 471-fold in the feeding instars.
Unlike host wheat, Bd exhibits an intermediate phenotypic and molecular response to Hessian fly larval attack (Hargarten et al. 2017; Subramanyam et al. 2019). Some larvae die within 5 DAH at the first-instar larval stage, as on resistant wheat, and a few develop into the second-instar stage (small white larvae), as on susceptible wheat, but none of them complete their development (Hargarten et al. 2017). MdesUGT1 may also play a role in allowing some larvae to withstand the nonhost defense toxins and develop via new cuticle formation, although none of them pupate on Bd plants.

Similar selective expression of genes encoding UGTs and other biotransformation enzymes, in response to specific inducers in the larval diet, has been reported in several insects including the Hessian fly. BmUGT010286, a UGT identified from Bombyx mori, is involved in the detoxification of allelochemicals from the mulberry tree (Morus alba) (Huang et al. 2008). B. mori larvae take up flavonoids from mulberry leaves into their cocoons, and glucosylation is the pathway the insect employs to eliminate these dietary flavonoids (Fujimoto et al. 1972). Another UGT gene from B. mori, BmUGT1, is induced to higher levels in response to flavonoids and coumarins (Luque et al. 2002). Increased expression of genes encoding other biotransformation enzymes, including glutathione S-transferases and cytochrome P450s, has also been documented in feeding instars of virulent Hessian fly larvae during compatible interactions (Mittapalli et al. 2005, 2007). The mRNA levels of MdesGST-3, a gene encoding a glutathione S-transferase, were found to be significantly higher in the Hessian fly feeding instars, and the enzyme was proposed to play a key role in detoxifying xenobiotics including insecticides and plant allelochemicals (Mittapalli et al. 2007). Similarly, CYP6AZ1, a gene belonging to another class of detoxifying enzymes, the cytochrome P450s, was induced during active Hessian fly larval feeding (Mittapalli et al. 2005). CYP6AZ1 is midgut-specific, its expression is influenced by chemicals present in the host wheat plant, and it is suggested to be required for larval feeding during compatible interactions. Larval feeding triggers elevated expression of select phenylpropanoid pathway genes, which are involved in the biosynthesis of secondary metabolites, in both Hessian fly-infested host wheat (Sardesai et al. 2005; Khajuria et al. 2013) and nonhost Bd (Subramanyam et al. 2019).
These findings suggest that the early increased expression of MdesUGT1 during active larval feeding complements the previously documented detoxification components of the Hessian fly and may be essential for the biotransformation of host- and nonhost-derived secondary metabolites encountered by the larvae.

Molecular characterization of the MdesUGT1 amino acid sequence revealed the presence of a predicted N-terminal signal peptide, a conserved signature motif, and a C-terminal transmembrane domain, characteristics shared by UGT sequences (Ahn et al. 2012; Mackenzie et al. 1997). MdesUGT1 is a member of the UGT301 family, which is also found in the dipterans Anopheles sinensis, An. gambiae, Ae. aegypti, and D. melanogaster (Ahn et al. 2012). The signature motif and the binding regions are essential for binding the UDP moiety of the nucleotide sugar. UGTs are membrane-bound proteins that are localized in the endoplasmic reticulum (ER). The signal peptide mediates the integration of the protein precursor into the ER compartment (Meech and Mackenzie 1997) and is subsequently cleaved. The protein is N-glycosylated, and the mature protein is retained in the ER membrane by its hydrophobic transmembrane domain (Ahn et al. 2012). MdesUGT1 harbors similar motifs, resembling other insect UGTs, which possibly makes it functionally competent to catalyze glucosidation.

Conclusion

UGTs constitute an important class of biotransformation enzymes in insects, with functions that include the detoxification of ingested plant toxins (Ahmad and Hopkins 1992; Ahn et al. 2011; Krempl et al. 2016) and cuticular development (Kramer and Hopkins 1987; Hopkins and Kramer 1992), among others. The present study is, to the best of our knowledge, the first report that provides an overview of UGTs from the M. destructor genome. Further, the preferential early expression of MdesUGT1 in virulent Hessian fly larvae suggests that it may be involved in defending against plant toxins during compatible host and nonhost interactions, in addition to playing roles in processes including cuticle formation at later stages. Future studies characterizing the functional role of MdesUGT1 via enzymatic and mutational assays will provide critical information on its involvement in defense against plant allelochemicals and in developmental mechanisms, and a better understanding of the Hessian fly-plant interaction, leading to efficient management strategies against this and other dipteran insect pests.