Genomic and functional genomics analyses of gluten proteins and prospect for simultaneous improvement of end-use and health-related traits in wheat

Key message
Recent genomic and functional genomics analyses have substantially improved the understanding of gluten proteins, which are important determinants of wheat grain quality traits. The new insights obtained and the availability of precise, versatile and high-throughput genome editing technologies will accelerate simultaneous improvement of wheat end-use and health-related traits.

Abstract
As a major staple food crop worldwide, wheat provides an indispensable source of dietary energy and nutrients to the human population. As the world population grows and living standards rise in both developed and developing countries, the demand for wheat with high quality attributes increases globally. However, efficient breeding of high-quality wheat depends critically on knowledge of gluten proteins, which mainly include several families of prolamin proteins specifically accumulated in the endospermic tissues of grains. Although gluten proteins have been studied for many decades, efficient manipulation of these proteins for simultaneous enhancement of end-use and health-related traits has been difficult because of the high complexity of their expression, function and genetic variation. However, recent genomic and functional genomics analyses have substantially improved the understanding of gluten proteins. Therefore, the main objective of this review is to summarize the genomic and functional genomics information obtained in the last 10 years on gluten protein chromosome loci and genes and the cis- and trans-factors regulating their expression in the grains, as well as the efforts in elucidating the involvement of gluten proteins in several wheat sensitivities affecting genetically susceptible human individuals. The new insights gathered, plus the availability of precise, versatile and high-throughput genome editing technologies, promise to speed up the concurrent improvement of wheat end-use and health-related traits and the development of high-quality cultivars for different consumption needs.


Introduction
Wheat (Triticum aestivum) is the most widely cultivated staple food crop in the world, providing approximately 20% of the total dietary calories and proteins globally and a wealth of additional health-promoting nutrients to the daily human diet (Shiferaw et al. 2013; Shewry and Hey 2015). Owing to its importance, global wheat production has risen significantly from 440.2 million tons in 1980 to 771.7 million tons in 2017 (FAOSTAT, http://www.fao.org/faostat/en/). As the world population is expected to reach 9.8 billion by 2050, food production has to increase by at least 50% compared to the current level (FAO 2017). Furthermore, consumer demand for healthier foods is increasing worldwide because of rising living standards. Consequently, substantial efforts have to be devoted to improving the yield and quality traits of major agricultural crops including wheat.
Traditionally, wheat quality refers mainly to the end-use properties of the flour (Delcour et al. 2012; Gras et al. 2001; Rasheed et al. 2014; Wrigley et al. 2009). This aspect has been extensively studied since the first description of wheat gluten in 1745 (Ma et al. 2019; Shewry 2019). Typical wheat gluten proteins include several families of structurally similar and yet distinctive prolamin proteins, i.e., high and low molecular weight glutenin subunits (HMW-GSs and LMW-GSs) and gliadins (Table 1), with similar proteins present in related Triticeae species such as rye and barley (Shewry et al. 2003). The gluten proteins in each of the families generally have two or more types (Table 1). In HMW-GSs, there exist x- and y-type subunits; in LMW-GSs, i-, m- and s-types are found; for gliadins, α-, γ-, δ- and ω-types are differentiated. A common structural feature shared by different glutenins and gliadins is the presence of a repetitive domain composed of repetitive motifs rich in glutamine (Q) and proline (P) residues (Table 1). Another important characteristic shared by the gluten proteins is that they are specifically expressed in the developing grains and accumulate to relatively high amounts in the endospermic tissues (Shewry et al. 2003); in wheat cultivars, gluten proteins generally account for ~80% of the total grain proteins (Altenbach 2017).
At the desiccation stage of wheat grain development, HMW-GSs and LMW-GSs form glutenin polymers through intra- and inter-molecular disulfide bonds (Table 1), with HMW-GSs as backbone and LMW-GSs as branches (Naeem and MacRitchie 2005; van Herpen et al. 2008a; Wrigley et al. 2009). During dough processing, glutenin polymers interact with monomeric gliadins and other proteins to form gluten polymers of various sizes, thus conferring viscoelasticity to the dough (MacRitchie 2014). Among the gluten polymers, only those with a molecular mass ≥ 250 kDa contribute significantly and positively to dough functionality and end-use properties (Bangur et al. 1997; Tronsmo et al. 2002). Depending on the method of analysis, the large-sized gluten polymers can be prepared as glutenin macropolymers (GMPs) or unextractable polymeric protein (UPP) complexes (Don et al. 2003a, b; Gupta et al. 1993). Don et al. (2006) showed that GMPs and UPP complexes are highly and positively correlated in both quantity and effects on dough functionality. These data, together with the findings from many genetic studies, support a model in which glutenins and gliadins are the main determinants of dough viscoelasticity and thus the end-use properties of wheat flour. Compared to gliadins, HMW-GSs and LMW-GSs contribute more significantly to both dough elasticity and extensibility. In addition to glutenins and gliadins, recent studies suggest that farinins (b-type avenin-like proteins) and purinins (LMW gliadins) may also be regarded as gluten proteins because of their participation in gluten polymers and influences on dough functionality and end-use quality (Kasarda et al. 2013; Shewry 2019) (see below).
Apart from functioning in end-use quality, gluten proteins are also known to be involved in a number of wheat food sensitivities, including celiac disease (CD), IgE-mediated wheat allergy and nonceliac wheat sensitivity (NCWS) (Cabanillas 2019; Scherf et al. 2016). Based on current understanding, these diseases may be defined as adverse reactions to gluten and related proteins by the immune system of genetically susceptible human individuals. CD is an autoimmune disease, with an incidence of 1-3% in the human population (Lionetti et al. 2015). The T cell epitopes triggering CD have been detected in a variety of gluten proteins from wheat and similar proteins in rye and barley, with the most immunogenic and toxic types located in wheat α- and ω-gliadins (Juhász et al. 2018; Scherf et al. 2016). CD causes damage to the small intestine and results in a plethora of symptoms including malabsorption of nutrients (Lionetti et al. 2015).
Table 1 Main characteristics of wheat gluten proteins. Based on the data presented in D'Ovidio and Masci (2004), She et al. (2011), Shewry (2019) and Wieser (2007)
a According to Anderson et al. (2012) and Wan et al. (2013)
b According to Schalk et al. (2017)
c Although most ω-gliadins do not contain cysteine, a few expressed ω-gliadins have recently been found to carry one cysteine in their deduced proteins (Vensel et al. 2014; Wang et al. 2017)
d Some of the α-, ω- and γ-gliadins carrying an odd number of cysteine residues may be incorporated into glutenin polymers through disulfide bonding (Ferrante et al. 2006; Vensel et al. 2014)
Wheat-dependent exercise-induced anaphylaxis (WDEIA) and baker's asthma are two commonly encountered IgE-mediated wheat allergies, with the incidences estimated to range from 0.33 to 1.17% (Cabanillas 2019). The ω-5 gliadins and HMW-GSs are major allergens associated with WDEIA (Altenbach et al. 2018; Matsuo et al. 2004, 2005). However, many other wheat grain proteins, e.g., α-amylase/trypsin inhibitors (ATIs) and nonspecific lipid transfer proteins (nsLTPs), may also be involved in IgE-mediated wheat allergy (Cabanillas 2019; Juhász et al. 2018). NCWS is a recently described wheat food-dependent disorder, with an estimated prevalence of 0.16-13% (Cabanillas 2019). The symptoms of NCWS may resemble those of CD, but do not involve damage to the intestine. There is no evidence for the involvement of autoimmunity or IgE-mediated reaction in NCWS; instead, activation of the innate immune system may play a role in this condition (Cabanillas 2019). The pathogenesis of NCWS is still poorly understood. It is likely that wheat ATIs and certain gluten protein components (e.g., gliadins) are involved in triggering the disorder (Cabanillas 2019; Zevallos et al. 2017).
Therefore, a major challenge in wheat quality research is to enhance the end-use properties of grains while minimizing the immunogenic potential of gluten proteins (Altenbach 2017; Shewry and Tatham 2016). To tackle this challenge effectively, a sound understanding of the expression, accumulation and genetic variation of gluten proteins is needed. Conventional methods, such as sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE) and high-performance liquid chromatography (HPLC), although useful for characterizing HMW-GSs that have fewer members expressed in wheat, are insufficient for high resolution analysis of LMW-GSs and gliadins. This is because LMW-GSs and gliadins are generally expressed from multigene families with some members being highly similar in sequence, molecular size and expression profiles (Altenbach 2017; Shewry et al. 2003). This problem also complicates the matching of different LMW-GSs and gliadins accumulated in the grains to their corresponding genes and transcripts (Dupont et al. 2011). However, with the advent of structural and functional genomics and the availability of genomic information for wheat in recent years (IWGSC 2018; Uauy 2017), the difficulties outlined above have been largely relieved, and our understanding of gluten protein expression profiles and functions has been substantially improved over the last 10 years. Consequently, the main objective of this review is to summarize the progress made in the genomic and functional genomics analyses of wheat gluten proteins. The prospect for simultaneously improving wheat end-use and health-related traits by genomic approaches will also be briefly discussed.

Genomic analysis of gluten quality-related chromosomal loci and genes
Although it has long been known that the genes encoding glutenins and gliadins are carried in complex chromosomal loci (Shewry et al. 2003), only recently have systematic efforts been made to elucidate the organization of these loci by using the genome sequence information obtained for the hexaploid wheat variety Chinese Spring (CS) and closely related diploid and tetraploid wheat species (Avni et al. 2017; IWGSC 2018; Ling et al. 2018; Luo et al. 2017; Zhao et al. 2017). The homoeologous Glu-1 loci (Glu-A1, -B1 and -D1), carrying HMW-GS genes and located on the long arms of group 1 chromosomes, are relatively simple. Two paralogous HMW-GS genes, encoding one x and one y subunit, respectively, exist in each Glu-1 locus, with the two paralogs separated by approximately 52-180 kb (Gu et al. 2006). The intergenic space between the two HMW-GS genes carries transposable elements as well as two genes predicted to encode a globulin and a protein kinase, respectively; immediately upstream of the x-type HMW-GS gene reside another globulin gene and a putative receptor kinase gene.
The three homoeologous composite loci carrying Gli-1 and Glu-3 (Gli-A1/Glu-A3, Gli-B1/Glu-B3 and Gli-D1/Glu-D3), located on the short arms of group 1 chromosomes, are highly complex. Recent genomic studies in CS and the D genome donor species Aegilops tauschii (Aet) show clearly that Gli-1 and Glu-3 are physically linked, with Gli-1 located upstream of Glu-3 (Dong et al. 2016; Huo et al. 2018a). The precise physical size of a Gli-1/Glu-3 composite locus is unknown at present, but is likely larger than 2 Mb. Based on the data gathered from CS and Aet, in each Gli-1 region, there are a number of genes coding for γ-gliadins (4-5), δ-gliadins (1-2) or ω-gliadins (3-8). In each Glu-3 region, there are 4-7 LMW-GS genes. There are also a few LMW-GS genes located outside of the main Glu-3 region, probably resulting from translocation events. Another prominent feature shared by Gli-1/Glu-3 composite loci is the presence of multiple copies of predicted receptor-like kinase genes and genes encoding NLR proteins, which carry a nucleotide-binding domain and leucine-rich repeats. The genes specifying γ- or δ-gliadins are usually clustered together, as are those coding for ω-gliadins, but those encoding LMW-GSs are frequently separated by one or more NLR genes. In addition, a couple of syntenic ancestral genes are conserved among homoeologous Gli-1/Glu-3 loci, which divide the genomic regions into four blocks, with blocks 1 and 2 encompassing Gli-1 and blocks 3 and 4 covering Glu-3.
Genomic insight has also been gained into the three α-gliadin chromosomal loci (Gli-A2, -B2 and -D2) located on the short arms of group 6 chromosomes (Huo et al. 2017, 2018b). In CS and Aet, the analyzed genomic regions carrying Gli-A2, -B2 or -D2 range from 387 to 836 kb, with the copy number of α-gliadin genes in the three loci varying from 12 to 24. Gli-A2, -B2 or -D2 regions are flanked by glutamate receptor-like (GRL) genes, with two GRL members at the 5′ end and one at the 3′ end; an internal insertion of another GRL member divides each Gli-2 region into two subregions. Unlike Gli-1/Glu-3 regions, Gli-2 loci are less interrupted by non-prolamin genes. The structure of the Gli-D2 locus in the Chinese wheat cultivar Xiaoyan 81 (Xy81) is similar but not identical to that present in CS and Aet. Several α-gliadin gene members present in CS and Aet are deleted in Xy81. However, two α-gliadin genes in Xy81 Gli-D2 are each duplicated once, thus maintaining a total of 10 such genes. These data demonstrate allelic variation of Gli-D2 among different wheat materials, which may also occur at Gli-A2 and -B2.
In addition to glutenins and gliadins, a number of studies have reported the expression of avenin-like proteins (ALPs) in wheat grains, with increasing evidence for their influence on dough functionality. The transcripts for two types of ALPs (avenin-like a and avenin-like b) were originally discovered in the analysis of differentially expressed storage protein transcripts in Aegilops and wheat seeds (Kan et al. 2006). Type-a ALPs have a molecular mass of ~18 kDa and carry 14 conserved cysteine (cys) residues in their deduced proteins. On the other hand, type-b ALPs possess either 18 or 19 cys residues and have a molecular mass around 34 kDa. The two types of ALPs were renamed as purinins and farinins, respectively, by Kasarda et al. (2013). The larger molecular mass of type-b ALPs is mainly due to the duplication of an internal cys-rich domain of ~120 amino acids. Type-a ALPs are related to the LMW gliadins reported previously and may not be incorporated into the gluten polymers (Kasarda et al. 2013). In contrast, type-b ALPs have been detected in gluten polymers by both proteomic and transgenic studies (Kasarda et al. 2013; Ma et al. 2013a, b; Mamone et al. 2009; Vensel et al. 2014). The genes coding for ALPs are present in wheat and a wide range of Triticeae species (Chen et al. 2008, 2016; Kan et al. 2006). By searching the annotated genome sequence of CS, a total of 15 genes, six for type-a ALPs, six for type-b ALPs and another three for type-c ALPs, which represent a previously unrecognized class of ALPs, have been identified (Zhang et al. 2018a, b). These genes are located on chromosome arms 4AL, 7AS and 7DS, with five members (two for type-a, two for type-b and one for type-c ALPs) on each arm. Finally, evidence for the contribution of type-b ALPs to wheat dough functionality and end-use quality has been obtained by several studies (Chen et al. 2010, 2016; Ma et al. 2013a, b). Potential effects of the other two types of ALPs on wheat gluten, dough and end-use properties remain to be determined.
The genomic organizations outlined above were obtained from only a limited number of genotypes. Variations are to be expected in wheat germplasm and closely related species, because the genomic regions carrying these highly complicated loci are subject to independent and dynamic evolution (Gu et al. 2006; Huo et al. 2018a, b). Moreover, these loci are constantly modified by wheat breeding efforts because of their influences on end-use and health-related traits (Branlard et al. 2001; Dong et al. 2013, 2017; Gras et al. 2001; Wrigley et al. 2009).

Genes regulating the expression of gluten proteins
Gluten proteins are specifically and primarily expressed in the endospermic tissues of developing wheat grains. Transcriptional regulation, brought about by intricate interactions between cis- and trans-acting factors, plays a key role in the control of gluten gene expression. There exist a large number of cis-elements in the promoter regions of glutenin and gliadin genes. For example, the GCN4-like motif (GLM) and prolamin box (P-box), which are bound by basic leucine zipper (bZIP) and DNA binding with one finger (DOF) transcription factors (TFs), respectively, are present in the promoter regions of gluten genes, including HMW-GS, LMW-GS and α/β-gliadin genes (Albani et al. 1997; Dong et al. 2007; Juhász et al. 2011; Li et al. 2019; Noma et al. 2016; Ravel et al. 2014; She et al. 2011; Thomas and Flavell 1990; van Herpen et al. 2008b; Wang et al. 2013).
Two recent studies have provided substantial insights into the presence and function of conserved cis-regulatory modules (CCRM) in the promoters of HMW-GS genes (Li et al. 2019; Ravel et al. 2014). In the former study, wheat lines transformed with various promoter:GUS fusion constructs of a HMW-GS gene were developed and analyzed. The results showed that the 300 bp region, upstream of the translation initiation codon and carrying CCRM1 (−300 to −101), is sufficient for conferring endospermic expression of the HMW-GS gene; the more upstream CCRMs, i.e., CCRM2 (−650 to −400) and CCRM3 (−950 to −750), enhance the expression of the HMW-GS gene but have no effect on its expression specificity. More detailed analysis of the 300 bp basal promoter suggests that CCRM1-1 (−208 to −101) is indispensable for HMW-GS gene expression in the endosperm tissues, whereas CCRM1-2 (−300 to −209) is required for the timely onset of HMW-GS gene expression in the endosperm. The CCRMs provide a general and useful framework for further dissecting the functions of different cis-elements in the transcriptional regulation of HMW-GS genes. However, it is worth noting that homoeologous and paralogous HMW-GS, LMW-GS and α/β-gliadin genes often show indel polymorphisms in their promoter regions, which result in differences in the numbers and types of cis-elements contained (Geng et al. 2014; Juhász et al. 2011; Noma et al. 2016; Wang et al. 2013). Li et al. (2019) suggest that the CCRMs defined for HMW-GS gene promoters are not well conserved for LMW-GS and gliadin genes, indicating that differences exist in the cis-acting elements carried by the promoters of different types of gluten genes. This phenomenon was also observed in previous studies (Juhász et al. 2011; van Herpen et al. 2008b; Wang et al. 2013).
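The CCRM coordinates quoted above can be encoded as a simple lookup, which may be convenient when assigning a cis-element found by motif scanning to a regulatory module. The following is a minimal sketch only: the coordinates are those given in the text (bp relative to the translation initiation codon), whereas the dictionary layout and the function name `ccrm_for_position` are hypothetical and not taken from the cited studies.

```python
# CCRM coordinates (bp relative to the translation initiation codon) as quoted
# in the text for HMW-GS gene promoters. The layout is an illustrative assumption.
CCRMS = {
    "CCRM1-1": (-208, -101),  # indispensable for endosperm expression
    "CCRM1-2": (-300, -209),  # timely onset of expression
    "CCRM2": (-650, -400),    # enhances expression
    "CCRM3": (-950, -750),    # enhances expression
}


def ccrm_for_position(pos):
    """Return the name of the CCRM containing promoter position `pos`, or None."""
    for name, (start, end) in CCRMS.items():
        if start <= pos <= end:
            return name
    return None


if __name__ == "__main__":
    for pos in (-150, -250, -350, -500, -800):
        print(pos, ccrm_for_position(pos))
```

Positions falling between modules (for example −350) return `None`, reflecting that the modules as quoted do not tile the promoter continuously.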
Considerable progress has also been made in the genomic analysis of trans-acting factors that affect gluten gene expression. In the study by Plessis et al. (2013), a diverse set of candidate genes encoding putative transcription factors (TFs), histones or chromatin modification proteins were found to be significantly associated with the composition of glutenins and gliadins. Many of the associated genes are orthologs of the barley genes with demonstrated roles in regulating grain storage protein accumulation. A number of the associated TFs have been functionally confirmed to regulate the expression of glutenins and gliadins (Table 2). The bZIP TFs SPA and SHP have been shown to promote and repress the transcription of HMW-GS and LMW-GS genes, respectively (Albani et al. 1997; Boudet et al. 2019; Ravel et al. 2009). Wheat prolamin binding factor (WPBF), a DOF TF, has been found to be required for efficient expression of LMW-GSs and gliadins in the grains (Dong et al. 2007; Moehs et al. 2019; Ravel et al. 2006). Another DOF TF, PBF-D, binds the P-box element in the promoters of the HMW-GS genes Glu-1By8 and -1Dx2, and its overexpression can significantly increase the accumulation levels of HMW-GSs in the grains of transgenic wheat plants. Guo et al. (2015) characterized a regulatory module consisting of the MYB TF TaGAMyb and the histone acetyltransferase TaGCN5, which regulates the expression of the HMW-GS gene Glu-1Dy by establishing a histone H3 acetylation pattern conducive to active gene transcription. Sun et al. (2017) identified TaFUSCA3, which is a B3 domain-containing TF and can transactivate the promoter of the HMW-GS gene Glu-1Bx7 through binding to the cis-element RY repeat. Finally, the wheat DME gene encoding 5-methylcytosine DNA glycosylase is required for efficient expression of LMW-GS and gliadin genes by active demethylation of their promoters in developing wheat grains (Wen et al. 2012). Taken together, the available data suggest that transcriptional regulation of gluten genes involves complex interactions between different cis- and trans-acting factors. Differences in these interactions may underlie variations in the expression patterns of individual gluten genes.

Genome-wide analysis of gluten gene transcription
Genome-wide analysis of gluten gene transcription can yield useful information on the expression profiles of different gluten gene members, which aids investigations of the functional importance of specific gluten proteins in the control of end-use and health-related traits. Genomic analysis of HMW-GS gene expression is straightforward because the number of genes involved is small and the homoeologous and paralogous members are relatively easy to differentiate. However, such analysis is quite difficult for LMW-GS and gliadin genes because of the presence of multiple homoeologs and paralogs, many of which have high sequence similarity, in the wheat genome. To date, three main approaches have been used to investigate genome-wide expression patterns of gluten genes. The first approach is based on the identification and analysis of expressed sequence tags (ESTs) coupled with quantitative PCR assay of specific gene members. Using this approach, Kawaura et al. (2005) reported the expression of 36 α/β-gliadin and 15 LMW-GS genes in CS developing grains. In another wheat cultivar, Butte, the expression of 5 HMW-GS, 22 LMW-GS, 23 α-gliadin, 13 γ-gliadin and 7 ω-gliadin genes was detected in the grains through EST analysis (Altenbach et al. 2010; Dupont et al. 2011). EST analysis has also revealed remarkable variations in the relative expression levels of α-gliadins specified by homoeologous Gli-2 loci among different wheat genotypes, which may facilitate the development of wheat lines with decreased content of α-gliadins. The second approach uses oligonucleotide array hybridization to investigate the genes expressed during wheat grain development (Wan et al. 2008). By this approach, the genes encoding TaALPs (see above) and a new class of gliadin genes, corresponding to the δ-gliadins described by Anderson et al. (2012), were identified (Kan et al. 2006; Wan et al. 2013).
In the third approach, next-generation sequencing technologies, including Illumina HiSeq and PacBio long-read sequencing platforms, are employed to analyze gluten gene transcription in wheat grains. Long-read transcriptome sequencing, which can encompass the coding region of most eukaryotic transcripts, facilitates the identification and differentiation of the transcripts of homoeologous and paralogous gluten genes. The high-coverage short transcriptome reads yielded by HiSeq sequencing are useful for correcting the base errors associated with long-read sequencing and for estimating the expression levels of individual gluten gene members. With this approach, Dong et al. (2015) identified the transcripts for 6 HMW-GS, 14 LMW-GS, 32 α/β-gliadin, 14 γ-gliadin and 6 ω-gliadin genes in the developing grains of Xy81, with an intact open reading frame (ORF) found in 5 HMW-GS, 12 LMW-GS, 25 α/β-gliadin, 12 γ-gliadin (including 1 δ-gliadin) and 4 ω-gliadin gene members. In CS, the use of a similar approach identified the transcripts for 10 LMW-GS, 25 α/β-gliadin, 11 γ-gliadin, 2 δ-gliadin and 7 ω-gliadin genes in the grains (Huo et al. 2018a, b). In general, the transcript levels of the gluten genes differed widely (Huo et al. 2018a, b; Wang et al. 2017). For example, in the developing grains of CS examined at 20 days after anthesis, the transcript levels of 25 α-gliadin genes varied by as much as 206-fold based on fragments per kilobase per million mapped reads (FPKM) values; no transcript was detected for one α-gliadin gene (i.e., α-B17) although it carried an intact ORF (Altenbach et al. 2019a, b). Together, the above studies show that the third approach is more powerful for transcriptomic analysis of gluten gene expression in wheat.
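For readers less familiar with the FPKM measure mentioned above, the following minimal sketch shows the standard normalization and how a fold range among expressed genes could be derived from it. The function names and all numbers are hypothetical and serve only as illustration; they are not data from the cited studies.

```python
def fpkm(fragments, length_bp, total_mapped):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped)


def fold_range(values):
    """Ratio of highest to lowest non-zero expression value.

    Genes with no detected transcript (value 0) are excluded, since a
    fold change relative to zero is undefined.
    """
    expressed = [v for v in values if v > 0]
    return max(expressed) / min(expressed)


if __name__ == "__main__":
    # Hypothetical example: 1000 fragments on a 1000 bp transcript,
    # with one million mapped fragments in the library.
    print(fpkm(1000, 1000, 1_000_000))
    # Hypothetical FPKM values for three genes, one of them undetected.
    print(fold_range([120.0, 3.0, 0.0]))
```

Because FPKM corrects for both transcript length and library size, it allows the transcript levels of gene members of different lengths to be compared within a sample.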

Proteomic analysis of gluten proteins
Despite the fact that gluten gene expression is primarily regulated at the transcriptional level, proteomic information is essential for understanding (1) accumulation levels of individual gluten gene members in grains, (2) roles of specific gluten proteins in gluten and dough functionalities and (3) effects of environmental factors and cultivation measures on gluten protein accumulation and function (Altenbach 2017; Ribeiro et al. 2013). The basic steps in gluten proteomic studies include separation of gluten proteins by two-dimensional gel electrophoresis (2-DE), excision and enzymatic digestion of protein spots from 2-DE gels, and identification of proteins by various types of mass spectrometry (MS) methods, such as tandem MS (MS/MS), electrospray ionization tandem mass spectrometry (ESI/MS/MS) and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) (Dong et al. 2010; Dupont et al. 2011; Ferranti et al. 2007; Liu et al. 2010; Mamone et al. 2005, 2009; Muccilli et al. 2005). In addition to 2-DE, gel-free methods based on liquid chromatography (LC) have also been employed for separating protease-digested gluten proteins for MS analysis (Bromilow et al. 2017a; Colgrave et al. 2015; Fiedler et al. 2014; Schalk et al. 2017; Uvackova et al. 2013). Recently, Bromilow et al. (2017a) showed that LC-MS/MS analysis using a combination of QTOF (quadrupole time of flight) and LTQ (linear ion trap quadrupole) platforms is desirable for more comprehensive characterization of gluten proteins.

Proteomic insight into gluten protein accumulation in wheat grains
Owing to the multiplicity of gluten proteins and high sequence similarities among some LMW-GS or gliadin protein members, it has been difficult to distinguish between closely related gluten protein homologs in proteomic experiments. This problem is further aggravated by the fact that the enzyme trypsin, commonly used for digesting proteome samples, has few cleavage sites in gluten proteins, owing to the presence of the repetitive motifs rich in glutamine and proline and the low percentages of the arginine and lysine residues required for trypsin digestion in these proteins (Dupont et al. 2011). However, these difficulties are largely overcome by digesting each gluten protein sample separately with multiple proteases (e.g., trypsin, thermolysin and chymotrypsin) (Altenbach et al. 2010; Dupont et al. 2011). A further problem in gluten proteomic studies is to match accumulated gluten proteins to their corresponding encoding genes. This effort is likewise complicated by the high copy number of gluten genes and strong sequence similarities among some gluten gene members. To tackle this problem, it is necessary to develop and use cultivar-specific gluten gene sequence databases (Altenbach et al. 2010; Bromilow et al. 2017b). For this purpose, earlier studies used gluten gene transcripts reconstructed from ESTs or the genomic sequences of specific gluten genes determined from sequencing bacterial artificial chromosome clones (Altenbach et al. 2010; Dong et al. 2010). Subsequently, full-length gluten gene transcripts identified from PacBio long-read RNA sequencing data were employed for the matching. Recently, Altenbach et al. (2019a, b) used the annotated reference genome sequence information for matching gluten proteins to their cognate genes in CS.
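The benefit of using more than one protease can be illustrated with a simple in silico digestion. The sketch below is a rough illustration under stated assumptions: the peptide is a made-up glutamine- and proline-rich sequence, not a real gluten protein; the cleavage rules are deliberately simplified (trypsin cutting after K or R, chymotrypsin after F, Y, W or L, and neither cutting when the next residue is P); and the function name `digest` is hypothetical.

```python
# Made-up Q/P-rich sequence with a single C-terminal arginine (assumption).
SEQ = "QQPFQQPQQPYQQPQQFPQQLQQR"


def digest(seq, cut_after):
    """Split `seq` after each residue in `cut_after`, unless followed by proline.

    Simplified model of protease specificity; real enzymes show additional
    sequence preferences and missed cleavages.
    """
    peptides, start = [], 0
    for i in range(len(seq) - 1):  # the last residue never creates a cut
        if seq[i] in cut_after and seq[i + 1] != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    peptides.append(seq[start:])
    return peptides


if __name__ == "__main__":
    print("trypsin:     ", digest(SEQ, "KR"))
    print("chymotrypsin:", digest(SEQ, "FYWL"))
```

With these simplified rules, the model sequence remains a single long peptide under trypsin but is split into four peptides by chymotrypsin, which mirrors why separate digestions with several proteases improve sequence coverage of gluten proteins.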
Through the various efforts outlined above, valuable proteomic information has been obtained for the gluten proteins of a number of wheat cultivars. In Butte, the expression of 5 HMW-GSs, 22 LMW-GSs, 23 α-gliadins, 13 γ-gliadins and 7 ω-gliadins in the flour was detected (Dupont et al. 2011). In Xiaoyan 54, the accumulation of 11 LMW-GSs in the grains was supported by both transcriptional and proteomic data (Dong et al. 2010). In Xy81, a combination of transcriptomic and proteomic analyses revealed the accumulation of 38 gliadins in mature grains, which included 21 α-, 11 γ-, 1 δ- and 5 ω-gliadins. Cho et al. (2018) identified the expression of 23 α-gliadins, 11 γ-gliadins and 5 ω-gliadins in the grains of the Korean cultivar Keumkang, although no attempt was made to link the gluten proteins to their coding genes. In CS, six HMW-GS genes including two pseudogene members were characterized, with the four active members expressing 1Bx7, 1By8, 1Dx2 and 1Dy12 subunits accumulated in the grains (van den ). Through analyzing the reference genome sequencing data and conducting additional validation experiments, Huo et al. (2018a, b) identified a complete set of gliadin genes (including 47 for α-gliadins, 14 for γ-gliadins, 5 for δ-gliadins and 19 for ω-gliadins) and a total of 17 LMW-GS genes for CS; but the number of genes with an intact ORF was found to be 26 for α-gliadins, 11 for γ-gliadins, 2 for δ-gliadins, 7 for ω-gliadins and 10 for LMW-GSs. Interestingly, proteomic analysis of CS flour samples identified the protein products for only 16 α-gliadin, 10 γ-gliadin, 1 δ-gliadin, 6 ω-gliadin and 9 LMW-GS genes; the gluten genes whose products were not found were either expressed at relatively low levels or had deduced products showing high similarities to other identified gluten members (Altenbach et al. 2019a, b). Wang et al. (2017) also failed to find protein products for the four α-gliadin genes with relatively low transcript levels in their analysis of Xy81 gluten proteins. On the other hand, alpha-D8 was the most highly accumulated α-gliadin in CS flour, although its transcript level was very low compared to those of highly transcribed α-gliadins in CS developing grains (Altenbach et al. 2019a, b). These findings suggest that the transcription and translation of gliadin genes are regulated in complex manners in wheat, which require further efforts to be fully elucidated.
From the data available, it seems that wheat cultivars may accumulate approximately 20-30 α-gliadins, 10-15 γ-gliadins, 1-3 δ-gliadins, 5-8 ω-gliadins, 10-15 LMW-GSs and 3-5 HMW-GSs in their grains. These rough estimates may help future proteomic and biochemical studies of the main gluten proteins in commercial wheat. Meanwhile, additional work is needed to investigate the composition and accumulation levels of gluten proteins in more diverse wheat germplasm materials. This is particularly relevant for α-gliadins because their coding genes are more numerous and there are large discrepancies in the reported numbers of α-gliadin genes in the same wheat material or among different wheat genotypes. For example, 47 α-gliadin genes were identified for CS based on analyzing the reference genome sequence (Huo et al. 2018b), but 90 such genes were detected for CS through sequencing PCR clones (Noma et al. 2016). Anderson et al. (1997) estimated 150 α-gliadin genes for the wheat cultivar Cheyenne based on Southern blot hybridization signals. Therefore, in future studies on the expression of α-gliadins, it is essential to determine the precise copy numbers of their genes in the wheat lines to be examined using more accurate genomic or molecular techniques, such as targeted sequence capture or droplet digital PCR (Altenbach et al. 2019a, b; Jouanin et al. 2019b). This information, coupled with high-throughput grain transcriptomic and proteomic data, may enable efficient elucidation of the mechanisms underpinning the transcriptional and translational regulation of α-gliadin genes in wheat.
It is important to point out that many wheat cultivars and germplasm lines accumulate rye secalins in their grains because they carry the 1BL/1RS translocation chromosome (Graybosch 2001). The Sec-1 locus of 1RS harbors the genes that encode γ- and ω-secalins, which resemble wheat γ-gliadins and ω-gliadins, respectively (Chai et al. 2005; Clarke et al. 1996). Although many genetic and rheological studies have found that these secalins negatively affect gluten, dough and end-use properties (Barbeau et al. 2003; Dhaliwal et al. 1990), the precise copy numbers of γ- and ω-secalin genes carried by Sec-1 are still not well understood, nor is it clear how many distinct γ- and ω-secalins accumulate in the grains of 1BL/1RS wheat varieties. Nevertheless, three studies have estimated the size of Sec-1 to be at least 145 kb or 195 kb and to contain 15 or 18 ω-secalin genes (Clarke et al. 1996; Li et al. 2016; Yamamoto and Mukai 2005). The transcripts for 17 different ω-secalin genes were detected in the developing grains of the 1BL/1RS variety Shimai 15 by PCR cloning of cDNAs (Li et al. 2016). Four ω-secalin protein bands were detected in an SDS-PAGE analysis of 14 1BL/1RS cultivars (Chai et al. 2016a, b), and multiple secalin protein spots were found in 2-DE/MS studies of the wheat genotypes carrying 1BL/1RS (Blechl et al. 2016; Gobaa et al. 2007). These data should aid more detailed proteomic investigations of secalin expression in wheat grains in the future.

Proteomic investigation of posttranslational modification of gluten proteins
Posttranslational modification (PTM) is frequently investigated in proteomic analyses of gluten proteins. However, there is still no strong evidence for extensive PTMs of gluten proteins in wheat grains or during dough processing. Nevertheless, several types of PTMs have been recorded for specific gluten proteins. In HMW-GSs, the 1By subunits (e.g., 1By8 and 1By9) have been shown to undergo two posttranslational cleavages at the C-terminal tail, resulting in two minor proteoforms visible on protein gels (Nunes-Miranda et al. 2017). The enzyme responsible for the cleavages may be an asparaginyl endopeptidase. The loss of the C-terminal end, which includes a conserved cysteine residue involved in disulfide bond formation, may weaken the contribution of 1By subunits to gluten and dough functionalities (Nunes-Miranda et al. 2017). In LMW-GSs, PTM has been found to play a significant role in the processing of the N-terminal end of m- and s-type subunits (Dupont et al. 2011; Egidi et al. 2014). By combining transgenic and proteomic investigations, Egidi et al. (2014) proposed a model for the maturation of m- and s-type LMW-GSs. For the m-type subunits, the signal peptide (19 residues) is removed, followed by removal of the subsequent glutamine residue by an aminopeptidase, yielding mature proteins that start with METSCIF. For the s-type subunits, in addition to signal peptide processing and removal of the glutamine residue, a third step, possibly mediated by an asparaginyl endopeptidase, removes the MEN fragment and generates the mature protein starting with SHIPGL. In gliadins, an asparaginyl endopeptidase-mediated cleavage has been found to be involved in the processing of the ω-gliadins specified by the Gli-A1 or Gli-D1 loci (Dupont et al. 2004). Such a cleavage has also been proposed to occur in the processing of some farinin proteins (Kasarda et al. 2013).
Finally, widespread glutamine deamidation in different types of gluten proteins has been revealed by several proteomic studies, although it is currently unclear whether this modification occurs genuinely in the grains or is caused by sample treatment during proteomic analysis (Bromilow et al. 2017a; Dupont et al. 2011; Martínez-Esteso et al. 2016). Phosphorylation, a common form of PTM, has not been found for gluten proteins, although it is readily detected in many wheat grain proteins involved in diverse physiological and biochemical processes (Zhang et al. 2014; Zhen et al. 2017).
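The stepwise maturation model proposed by Egidi et al. (2014) for m- and s-type LMW-GSs can be illustrated with a minimal Python sketch. Only the 19-residue signal peptide length, the glutamine removal, the MEN fragment and the N-terminal sequences METSCIF and SHIPGL come from the text; the signal peptide sequence, the C-terminal tail and the function name are hypothetical placeholders introduced for illustration.

```python
SIGNAL_PEPTIDE_LENGTH = 19  # signal peptide length stated in the model


def mature_lmw_gs(precursor: str, subunit_type: str) -> str:
    """Apply the processing steps of the model to a precursor sequence."""
    # Step 1: signal peptide processing.
    protein = precursor[SIGNAL_PEPTIDE_LENGTH:]
    # Step 2: aminopeptidase removal of the glutamine (Q) that follows.
    if not protein.startswith("Q"):
        raise ValueError("expected a glutamine after the signal peptide")
    protein = protein[1:]
    if subunit_type == "m":
        return protein
    if subunit_type == "s":
        # Step 3 (s-type only): removal of the MEN fragment, possibly
        # mediated by an asparaginyl endopeptidase.
        if not protein.startswith("MEN"):
            raise ValueError("expected the MEN fragment in an s-type subunit")
        return protein[3:]
    raise ValueError("subunit_type must be 'm' or 's'")


# Hypothetical placeholders, not real LMW-GS sequences.
HYPOTHETICAL_SIGNAL = "M" + "A" * 18   # 19 residues
HYPOTHETICAL_TAIL = "PPQQ"

m_precursor = HYPOTHETICAL_SIGNAL + "Q" + "METSCIF" + HYPOTHETICAL_TAIL
s_precursor = HYPOTHETICAL_SIGNAL + "Q" + "MEN" + "SHIPGL" + HYPOTHETICAL_TAIL

print(mature_lmw_gs(m_precursor, "m"))
print(mature_lmw_gs(s_precursor, "s"))
```

The sketch treats each processing step as a simple prefix removal; it does not model enzyme specificity or alternative cleavage sites.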

Composition of gluten polymers as investigated using proteomic analysis
As pointed out earlier, the quantity and size distribution of GMPs are important indicators of dough and end-use properties, and GMPs are quantitatively and functionally correlated with the amount of UPP complexes (see above). As demonstrated by Vensel et al. (2014), proteomic analysis represents an efficient tool for investigating the composition of UPP in wheat. In their study, the composition of a major UPP fraction (UPP peak 1), which was insoluble in 0.5% SDS and hence contained GMPs, was analyzed using 2-DE-MS/MS. HMW-GSs and LMW-GSs were found to be the main components, accounting for 28.52% and 44.72% of the fraction, respectively. The α-, γ- and ω-gliadins with an odd number of cysteine residues, which thus act as glutenin chain terminators, were also identified (5.43%). In addition, this fraction contained monomeric gliadins (12.61%), serpins (3.41%), triticins (3.84%) and globulins (0.57%), which together made up 20.43% of the fraction. In parallel, the same study analyzed the protein composition of the major extractable polymeric protein (EPP) fraction (EPP peak 1), which was soluble in 0.5% SDS and presumably contained smaller gluten polymers. HMW-GSs, chain-terminating gliadins, serpins, triticins and globulins were also present in EPP, but their proportions were 20.18%, 14.67%, 5.74%, 7.21% and 2.55%, respectively. These data suggest that glutenins (HMW-GSs and LMW-GSs) are the dominant component in UPP. In contrast, EPP contains a decreased amount of HMW-GSs, but increased quantities of chain-terminating gliadins, serpins, triticins and globulins. Mueller et al. (2016) prepared and analyzed GMP gel, which is formed by the largest gluten polymers (with the molecular mass ranging from 5 to 20 million Da). They found that the major protein component in GMP gel is glutenins (~ 90%), with gliadins occupying only ~ 10%. No albumins or globulins were found in the GMP gel analyzed.
From the information available, it seems that the larger the gluten polymers, the higher the proportion of glutenins they contain, with the proportions of gliadins and other proteins decreasing accordingly. However, it is well known that the amount and size distribution of gluten polymers are affected by both genotypes and environments (Johansson et al. 2013; Ni et al. 2014; Zhao et al. 2011; Zhang et al. 2016). Therefore, further proteomic work using more wheat varieties cultivated in different environments is needed in order to obtain a better understanding of the dynamic changes in the composition of gluten polymers in response to genetic background and growth conditions.
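The comparison between UPP peak 1 and EPP peak 1 rests on simple arithmetic over the percentages reported by Vensel et al. (2014). The following Python sketch re-derives the totals and the direction of change for each shared protein class. The values are those quoted above; the dictionary and function names are illustrative assumptions, and the LMW-GS share of EPP is omitted because it is not given in the text.

```python
# Percentages of each fraction as quoted in the text (Vensel et al. 2014).
UPP_PEAK1 = {
    "HMW-GSs": 28.52,
    "LMW-GSs": 44.72,
    "chain-terminating gliadins": 5.43,
    "monomeric gliadins": 12.61,
    "serpins": 3.41,
    "triticins": 3.84,
    "globulins": 0.57,
}
EPP_PEAK1 = {
    "HMW-GSs": 20.18,
    "chain-terminating gliadins": 14.67,
    "serpins": 5.74,
    "triticins": 7.21,
    "globulins": 2.55,
}


def share(fraction, classes):
    """Sum the percentages of the named protein classes, rounded to 2 dp."""
    return round(sum(fraction[name] for name in classes), 2)


def shifts(reference, other):
    """Percentage-point change (other minus reference) for shared classes."""
    return {
        name: round(other[name] - reference[name], 2)
        for name in other
        if name in reference
    }


upp_glutenins = share(UPP_PEAK1, ["HMW-GSs", "LMW-GSs"])
upp_other = share(
    UPP_PEAK1, ["monomeric gliadins", "serpins", "triticins", "globulins"]
)
upp_total = share(UPP_PEAK1, UPP_PEAK1.keys())

print("Glutenins in UPP peak 1 (%):", upp_glutenins)
print("Monomeric gliadins + serpins + triticins + globulins (%):", upp_other)
print("Sum of all listed UPP classes (%):", upp_total)
for name, change in shifts(UPP_PEAK1, EPP_PEAK1).items():
    print(f"EPP minus UPP, {name}: {change:+.2f} percentage points")
```

Under these figures, glutenins make up roughly three quarters of UPP peak 1, and HMW-GSs are the only shared class whose proportion is lower in EPP than in UPP.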

External factors on gluten protein accumulation
External changes, caused by fluctuations of environmental factors or application of cultivation measures, can have large effects on the performance and stability of end-use quality, which are mediated, at least partly, by altered expression, accumulation and function of gluten proteins (Altenbach 2012, 2017). Proteomic analysis provides an effective tool for deciphering the changes of wheat grain proteome (including gluten proteins) induced by external factors at a genome-wide level, and the findings have been reviewed in depth by Altenbach (2017). From the data reported so far, it appears that abiotic stresses generally induce complex proteomic changes in wheat grains, including decreased expression of the proteins and pathways involved in normal growth and physiological processes but up-regulated expression and function of those required for stress adaptation and tolerance, accompanied by significant reductions in grain filling period and kernel weight. Heat, drought or salt stress applied during flowering or post anthesis tends to increase the accumulation of α- and ω-gliadins and HMW-GSs, but exhibits differential effects on the accumulation of different LMW-GSs, with the genotype, the type of stress and the growth stage at which stress is encountered all significantly influencing the changes in gluten proteins (Hurkman et al. 2013; Zhang et al. 2018a, b; Zhou et al. 2018; Yang et al. 2011). The expression of α-gliadins seems to be more strongly affected by heat and drought, indicating that the regulation of these proteins is more sensitive to abiotic stresses. Rebalancing of gluten protein accumulation often occurs, with the amount increased for some members but decreased for the others. These changes can lead to elevation of grain protein content (GPC), resulting in improvement of end-use quality-related parameters under stress conditions.
Changes in fertilizer application likewise trigger complex alterations in wheat grain proteome, with the effects varying according to the types and amounts of fertilizers applied and the timing of application (Altenbach et al. 2011; Altenbach 2017; Xue et al. 2019; Zörb et al. 2018). Appropriate application of nitrogen can significantly promote gluten protein accumulation and end-use quality. For example, nitrogen applied at the booting stage increases GPC and the contents of most gluten proteins, especially HMW-GSs and α- and ω-gliadins, which is accompanied by the formation of more and larger protein bodies and enhanced expression of several protein disulfide isomerases required for efficient GMP formation (Xue et al. 2016, 2019; Yu et al. 2017; Zhong et al. 2019). High nitrogen application also elevates GPC and accumulation levels of many gluten proteins, although it may lead to reduced nitrogen use efficiency (Zheng et al. 2018; Zörb et al. 2018). Interestingly, Roy et al. (2019) demonstrated that restoring the expression of the HMW-GS gene Glu-1Ay, which is normally silenced in common wheat varieties worldwide, leads to increased GPC and breadmaking quality in the Australian variety Lincoln, indicating a new way of enhancing grain N accumulation and nitrogen use efficiency by enlarging the number of expressed HMW-GSs in wheat. Furthermore, a breakthrough in dissecting the molecular mechanism underlying nitrogen promotion of wheat GPC and GMPs was reported recently, which shows that adequate nitrogen supply enhances the availability of glutamine for different biological processes during grain development and, in the meantime, elevates GMPs by up-regulating the function of peptidyl-prolyl cis-trans isomerase (PPIase) through SUMOylation of PPIase with the aid of the small ubiquitin-related modifier 1.
Lastly, sufficient availability of sulfur in the soil has been shown to facilitate the synthesis of gluten proteins, particularly the S-rich α- and γ-gliadins and LMW-GSs (Bonnot et al. 2017; Grove et al. 2009; Zörb et al. 2010). Applications of phosphorus, magnesium, zinc and manganese fertilizers have also been found to confer beneficial effects on gluten protein synthesis in wheat grains (Gaj et al. 2013), although the specific proteomic changes involved remain to be determined.

Genome-wide analysis of wheat sensitivity-related gluten proteins
Before the availability of the wheat reference genome sequence, studies of wheat sensitivity-related gluten proteins and epitopes were largely limited to individual gluten members or specific types of them (Gilissen et al. 2014; Shan et al. 2002; Tye-Din et al. 2010; van Herpen et al. 2006). By analyzing the whole genome sequence of CS, Juhász et al. (2018) recently mapped and experimentally tested wheat immunoresponsive proteins at a genome-wide level. From the available studies, several main themes regarding CD and WDEIA have become apparent, with genome-wide information beginning to emerge for other wheat sensitivities (baker's asthma and NCWS).
First, the T cell CD epitopes are present in all major families of gluten proteins (HMW-GSs, LMW-GSs and gliadins), with the most important ones detected in the α- and ω-gliadins encoded by the wheat D subgenome. Juhász et al. (2018) identified 12 α- and ω-gliadins with comparatively high immune response, which were all encoded by the D subgenome. The highly toxic CD epitopes carried by Gli-D2 encoded α-gliadins are located in a 33-mer gliadin peptide resistant to digestion by human proteases (Shan et al. 2002). However, the number of gliadins carrying the 33-mer peptide is quite limited. In both CS and Xy81, only two α-gliadins were found to possess the 33-mer peptide among the diverse ranges of gliadins analyzed (Juhász et al. 2018; Li et al. 2018). The gluten proteins specified by the B subgenome tend to carry fewer CD epitopes with weaker immunogenic potential (Juhász et al. 2018).

Second, CD epitopes are primarily located in the repetitive region of gluten proteins; the C-terminal domain can also carry CD epitopes but with weaker immunogenic potential (Juhász et al. 2018; Shewry 2019). For many gluten proteins, especially γ-gliadins, Gli-D2 α-gliadins and the ω-gliadins encoded by Gli-D1, there usually exist multiple CD epitopes in their repetitive domain.
Third, a considerable proportion of gluten proteins carry none, or very few, of the CD epitopes known to date, particularly the α-gliadins encoded by the wheat B subgenome. In CS, at least 10 α-gliadins do not carry CD epitopes, 9 of which are encoded by the B subgenome (Gli-B2) (Huo et al. 2018b). Among the 38 α-, γ-, δ- and ω-gliadins found to accumulate in Xy81 grains, 10 members do not carry CD epitopes, which include 7 α-gliadins encoded by Gli-B2, 1 α-gliadin by Gli-D2, 1 ω-gliadin by Gli-B1 and 1 δ-gliadin by Gli-D1; 8 members, including 6 encoded by Gli-A2 and 2 by Gli-B2, have only 1 or 2 CD epitopes in their proteins. In wheat, the α-gliadins that carry no, or only 1 or 2, CD epitopes generally carry the CSTT motif; the great majority of CSTT gliadins are encoded by the B subgenome (11 in CS and 10 in Xy81), with 1 or 2 specified by the A or D subgenome (Huo et al. 2018b; Wang et al. 2017).
Fifth, no strong evidence has been obtained for the involvement of glutenins and gliadins in the elicitation of NCWS (Cabanillas 2019), although there is evidence that consumption of low-gliadin bread confers beneficial changes to gut microbiota of NCWS patients (García-Molina et al. 2019). However, certain subclasses of ATIs are emerging as contributors to NCWS (Zevallos et al. 2017). The ATIs and many nsLTPs also contain the epitopes associated with baker's asthma (Juhász et al. 2018).
Finally, the allergenic potential of gluten proteins is significantly affected by the growth environment. This is understandable considering that the expression of gluten proteins is frequently modulated by environmental factors and cultivation measures (see above). From the information available (Altenbach 2017; Juhász et al. 2018), it appears that high temperature stress increases the levels of CD epitopes owing to stimulation of α- and ω-gliadin accumulation, particularly the α-gliadins carrying the 33-mer peptide, while low temperature stress decreases the level of CD epitopes but increases the amount of certain immunostimulatory factors associated with WDEIA or baker's asthma. Brzozowski and Stasiewicz (2017) found that water stress at the flowering stage increased the levels of immunostimulatory ω-gliadins. Boukid et al. (2017) reported that the level of toxic CD epitopes was affected by complex interactions among wheat cultivars, the climate conditions of the growing season and the breeding histories of the examined wheat varietal population. Therefore, a comprehensive understanding of the effects of growth conditions on the immunogenic potential of gluten proteins may come from studies involving more diverse wheat genotypes cultivated in different environments.

Prospect for concurrent improvement of wheat end-use and health-related traits
Although it is highly desirable to simultaneously improve grain end-use and health-related traits, the task is very challenging because many of the gluten proteins involved in wheat sensitivities are actually important participants in wheat end-use quality control. However, there are encouraging findings: a number of studies have shown that decreasing gliadin accumulation can reduce wheat sensitivity-related epitopes without affecting wheat end-use quality, and many gluten proteins do not carry the known wheat sensitivity-related epitopes. Thus, several strategies have been explored to identify wheat genotypes with reduced gluten protein accumulation or to create genetically modified lines with decreased gluten content. The following is a brief summary of the five main approaches that are promising for simultaneously improving wheat grain end-use and health-related traits through removing (or modifying) the toxic gluten proteins while enhancing the functions of the gluten members without disease epitopes (Table 3).
The first approach is to use RNA interference (RNAi) to silence the expression of all, or specific types of, gliadins in transgenic wheat plants (Table 3). In general, the gliadin silenced lines showed decreased gliadin content, lowered immunogenic potential and improved end-use quality parameters (Becker et al. 2012; Gil-Humanes et al. 2008, 2014a; Pistón et al. 2011; Zörb et al. 2013). For example, Barro et al. (2016) could concurrently silence the expression of α-, γ- and ω-gliadins using combinations of RNAi constructs, which eliminated CD epitopes from the highly immunogenic α- and ω-gliadins but without affecting total protein and starch contents in the grains. RNAi has been also used successfully to silence the expression of ω-5 gliadins; the resultant lines exhibited improved flour quality and may be useful for decreasing the incidence of WDEIA (Altenbach and Allen 2011; Altenbach et al. 2014a, b, 2015). Recently, Altenbach et al. (2019a, b) decreased the expression of ω-1,2 gliadins by RNAi, with the resulting lines showing significantly reduced immunogenic potential and substantially improved end-use quality parameters. The rye secalins expressed in wheat background due to the presence of 1BL/1RS translocation chromosome can be effectively silenced using RNAi, with the transgenic lines showing enhanced dough functional properties (Blechl et al. 2016; Chai et al. 2016a, b). Lastly, Gil-Humanes et al. (2014b) revealed that the low-gliadin transgenic wheat lines produced using RNAi had also improved nutritional quality because of increased lysine content in the grains; García-Molina et al. (2019) noted that consumption of the bread made with the flour of a low-gliadin wheat (E82), which showed 98.1% reduction of gluten content when evaluated using the R5 antibody, induced positive changes in the composition of gut microbiota in NCWS patients.
Table 3 Approaches used for decreasing immunogenic potential of wheat gluten proteins
Approach | Gluten proteins targeted | Main finding | References
RNA interference of gliadin genes | γ-Gliadins | Reduced expression of γ-gliadins in nine transgenic lines, accompanied by increased levels of α- and ω-gliadins; higher SDS-sedimentation values observed in six transgenic lines | Gil-Humanes et al. (2008) and Pistón et al. (2011)
 | α-, ω- and/or γ-gliadins | Gliadin expression strongly down-regulated (by 85.6% on average); transgenic wheat lines with very low levels of toxicity for CD patients obtained; many of the transgenic lines exhibiting improved end-use quality and nutritional value | Gil-Humanes et al. (2010, 2014a)
 | ω-5 gliadins | Content of ω-5 gliadin decreased by 80% in one line and eliminated in another line; reactivity to the IgE antibody of WDEIA patients greatly reduced; dough functionality parameters improved | Altenbach and Allen (2011) and Altenbach et al. (2014a, b, 2015)
 | α-Gliadins | Content of α-gliadin strongly reduced in two transgenic lines, compensated by increased levels of γ- and ω-gliadins, HMW-GSs and other seed proteins; no significant effect on flour functionality observed | Becker et al. (2012)
 | α-, γ- and ω-gliadins, LMW-GSs | Strong silencing of different types of gliadins and LMW-GSs achieved; several lines devoid of CD epitopes from the highly immunogenic α- and ω-gliadins; total protein and starch contents unaffected in the grains of the transgenic lines | Barro et al. (2016)
 | Secalins in 1BL/1RS wheat line | Significant decreases in multiple secalins and closely related ω-gliadins; increased levels of α-gliadins and HMW-GSs; improved dough functionality for two transgenic lines |
The second approach is to develop wheat deletion lines lacking one or more gliadin chromosome loci (Table 3). Waga et al. (2013) developed and assessed three gliadin deletion lines with null allele at Gli-D1, Gli-B1 or Gli-B2 and found that the immunoreactivity of flour proteins of the deletion lines was 6-18% lower than that of wild-type control. Subsequently, wheat genotypes lacking both ω-1,2 and ω-5 gliadins were developed, which had a 30% decrease in gliadin immunoreactivity but improved gluten content and strength (Waga and Skoczowski 2014). Similarly, Camerlengo et al. (2017) described three wheat deletion lines lacking Gli-A2, Gli-D2 and Gli-A2/Gli-D2, respectively. These mutant lines had large decreases in α-gliadin expression, with the 33-mer peptide bearing gliadins not detected in the ones missing Gli-D2 or Gli-A2/Gli-D2. Wang et al. (2017) reported the development of six wheat deletion lines each lacking one of the six gliadin chromosome loci. The line DLGliD2, which has Gli-D2 deleted, showed improved dough functionality and breadmaking quality, with the level of CD epitopes significantly decreased (Wang et al. 2017). Being non-transgenic, these deletion lines may be directly used in developing the wheat lines less toxic to the individuals affected by wheat sensitivity problems.
The third approach is to develop transgenic wheat lines expressing engineered "glutenases" for targeted degradation of celiac inducing epitopes in the intestine (Table 3). Osorio et al. (2019) created transgenic wheat lines with endosperm specific expression of barley endoprotease B2 (EP-HvB2), Flavobacterium meningosepticum prolyl endopeptidase (PE-FmPep) and Pyrococcus furiosus prolyl endopeptidase (PE-PfuPep). These preconditioned gluten detoxifiers (EP-HvB2 + PE-FmPep or EP-HvB2 + PE-PfuPep) did not affect the end-use quality of flour, but could degrade the CD epitopes contained in the 33-mer gliadin peptide under simulated gastrointestinal conditions. Up to 72% reduction in the immunogenic peptides was found for the transgenic lines, thus opening the possibility of developing an intraluminal enzyme therapy for CD without negatively affecting wheat end-use quality and overall agronomical performance.
The fourth approach is to decrease the expression of gluten proteins through manipulating the regulators controlling prolamin gene expression (Table 3). Wen et al. (2012) demonstrated that functional suppression of the wheat DME gene, which encodes 5-methylcytosine DNA glycosylase, led to decreased accumulation of LMW-GSs and gliadins. Recently, Moehs et al. (2019) showed that elimination of the homoeologous genes encoding WPBF resulted in decreased accumulation of LMW-GSs and gliadins, which together accounted for 50-60% of wheat gluten proteins. These regulatory genes are potentially useful targets for developing low-gluten wheat lines, although efforts are needed to alleviate the side effects on agronomic traits associated with the mutation of these genes.
The fifth approach is to modify gluten gene expression using genome editing (Table 3). Genome editing is a rapidly developing technology for introducing site-targeted mutations into genic and regulatory regions (Knott and Doudna 2018; Yin et al. 2017). It relies on a nuclease (e.g., Cas9 or Cpf1) and a single guide RNA (sgRNA); the sgRNA, which is complementary to the target site, binds to the nuclease and directs the resulting ribonucleoprotein complex to that site. Depending on the methods used, the editing results in either indel mutations or base changes (i.e., A to G or C to T) at the target site (Chen et al. 2019). Genome editing can be performed for single or multiple genes with one or more sgRNAs. Using CRISPR/Cas9 mediated genome editing, Sánchez-León et al. (2018) succeeded in mutating a large number of α-gliadin genes in wheat (up to 35), with the immunoreactivity of gluten proteins reduced by as much as 85%. Jouanin et al. (2019a) confirmed that CRISPR/Cas9 is effective in mutating α-gliadin genes and further showed this method could be used to mutate γ-gliadin genes in wheat.
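How one sgRNA can address several members of a gene family may be illustrated with a minimal Python sketch. It is not the procedure used in the studies cited above. It assumes a 20-nt target followed by an NGG protospacer adjacent motif, as commonly described for Streptococcus pyogenes Cas9; this detail is not taken from the text. It scans the forward strand only, and all sequences, gene names and function names are hypothetical.

```python
import re

GUIDE_LENGTH = 20  # assumed target (protospacer) length for SpCas9


def find_cas9_sites(sequence, guide_length=GUIDE_LENGTH):
    """Return (start, protospacer, pam) for each forward-strand NGG site."""
    # A lookahead is used so that overlapping candidate sites are reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_length)
    return [
        (match.start(), match.group(1), match.group(2))
        for match in pattern.finditer(sequence.upper())
    ]


def genes_targeted_by_guide(family):
    """Map each candidate protospacer to the gene copies that contain it."""
    targets = {}
    for name, sequence in family.items():
        for _, protospacer, _ in find_cas9_sites(sequence):
            copies = targets.setdefault(protospacer, [])
            if name not in copies:
                copies.append(name)
    return targets


# Hypothetical toy sequences, not real gliadin genes.
TOY_FAMILY = {
    "copy_a": "ACGTACGTACGTACGTACGTAGGCC",
    "copy_b": "TTACGTACGTACGTACGTACGTTGGA",
    "copy_c": "ACGTACGTACGTACGTACGTAGTCC",  # motif disrupted (AGT)
}

for protospacer, copies in genes_targeted_by_guide(TOY_FAMILY).items():
    print(protospacer, "->", ", ".join(copies))
```

In this toy family, a single guide matches two copies, while the third copy is skipped because its adjacent motif is disrupted. A realistic design would also scan the reverse strand and screen for off-target matches elsewhere in the genome.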
Of the different approaches outlined above, genome editing is relatively new, and its utility in modifying gluten protein expression remains to be fully exploited. In addition to mutations created by indel-inducing CRISPR/Cas9, as demonstrated by Sánchez-León et al. (2018), various types of base editors, as reviewed recently (Chen et al. 2019; Mishra et al. 2019), may be employed to correct the gluten proteins that are important in end-use quality control but harbor wheat sensitivity-related epitopes. The versatility of genome editing took a big step forward recently with the development of prime editing, which can engineer all 12 forms of base substitutions, insertions (1 to ≥ 44 bp), deletions (1 to ≥ 80 bp) and combinations of the different types of alterations in a predetermined target site (Anzalone et al. 2019). The ability to conduct multiplex editing makes it possible to modify different families and subtypes of gluten genes in a high-throughput manner. Lastly, the different genome editing methods may also be useful for enhancing the end-use quality controlling function of those epitope-free gluten members. These attributes, plus the ease of obtaining genome-edited but transgene-free wheat plants (Sánchez-León et al. 2018), support the idea that genome editing has the highest potential in refining gluten protein composition for concurrent improvement of wheat end-use and health-related traits.

Conclusion
Genomic and functional genomics studies have substantially improved the understanding of gluten chromosomal loci and genes and the mechanisms regulating gluten protein expression. It is now possible to elucidate the complete repertoire of gluten genes and proteins in wheat and to monitor their expression changes at the whole genome level in response to alterations in the growth environments. Furthermore, an important molecular clue to how favorable environmental factors promote gluten protein functionality, i.e., enhancement of GMPs by SUMOylation of PPIase under adequate nitrogen conditions, has emerged. Meanwhile, genome-wide insights have been gained into the types and structures of immunogenic gluten proteins. Valuable approaches have been tested for simultaneously improving wheat end-use and health-related traits. However, there are still important gaps in the knowledge of the molecular networks controlling gluten gene expression and of the biochemical and biophysical mechanisms underlying gluten protein interactions. In addition, more efforts are needed to grasp, and to make efficient use of, the large genetic variations in gluten protein structure and expression in wheat germplasm. Looking into the future, the combination of genomic, functional genomics and genome editing studies will speed up the basic and applied research on gluten proteins, thus enabling efficient development of elite wheat varieties with the end-use and health-related traits suited to different consumption needs.