Keywords

15.1 Gene Discovery in the Context of Wheat Improvement and Breeding

If you compare two bread wheat (Triticum aestivum L.) cultivars, the chances are that you will find differences between themā€”and lots of them. Whether these differences are for agronomic traits, such as resistance to disease, for quality traits such as those important for bread making, or for a range of morphological traits such as those used to uniquely ā€˜describeā€™ a variety during varietal registration (Jones et al. 2013), such variation is abundant. It is the heritable component of these observable differences that is exploited via breeding to deliver new improved wheat varieties and deals with the complexities of pleiotropic effects resulting from the process. The question as to how best to do this is not a straightforward one. To give a simplified example, phenotypic selection for underlying combinations of genes and alleles that result in increased grain number per ear may result in fewer ears overall. Similarly, increasing the grain protein is often associated with a reduction in overall grain yield in wheat (Simmonds 1995; White et al. 2022) and other crop species (e.g. Dudley 2007), and increasing leaf size is thought to result in larger, but less dense stomata (Zanella et al. 2022). As the principal breeding target, grain yield represents the sum of all interacting genetic/epigenetic, environmental and management factors that occur from sowing to harvest. Selection for grain yield works well, with breeders having consistently deliveredā€‰~ā€‰1% genetic gains per year in wheat yield potential over recent decades (e.g. Mackay et al. 2011). To some extent, wheat breeding practices focus on delivering performance under the assessment criteria and carefully managed growth conditions used by national bodies to determine subsets of the ā€˜bestā€™ varieties marketed at a given time. In the United Kingdom (UK), for example, the annual AHDB ā€˜Recommended Listā€™ provides performance data for such varietal subsets to help farmers choose which varieties to grow (www.ahdb.co.uk/knowledge-library/reccommended-lists-for-cereals-and-oilseeds-rl). However, on-farm wheat yields are increasingly falling behind the genetic potential of the varieties grown. Termed the ā€˜yield-gapā€™, and observed in wheat growing areas across the world (Senapati et al. 2022), this is likely to be due to the costā€“benefit and practical considerations and trade-offs that take place under commercial farm conditions. Future wheat production will face additional challenges such as environmental change, soil degradation, increasing energy and input costs, and the effects of political conflict or instability. Thus, wheat genetic improvement will increasingly need to focus on yield stability under sub-optimal, fluctuating or unpredictable growth environmentsā€”delivered within the context of more sustainable food production systems. As the development of new wheat varieties is a relatively lengthy process (typically taking around 10Ā years), all available tools must be exploited to meet these challenges. As underpinning technologies advance, the ability to identify specific wheat genes or genetic loci, and understand how they function and interact within the context of crop performance, will play an increasingly important role towards delivering future wheat.

15.2 Genetic Variation in Hexaploid Bread Wheat: Luck, Bottlenecks and Breeding

If the foundation of gene discovery is heritable variation, then before exploring the germplasm and genomic resources currently in use to accelerate gene discovery and functional analysis in wheat, it is first worth briefly considering the history behind current wheat genetic variation. Collectively, the natural genetic variation present in modern day wheat represents the culmination of the speciation, domestication and breeding events and processes that have occurred in its past. Human selection and interventions have affected the wheat genome and the variation it contains, starting from its first origins in Neolithic farmersā€™ fields, up to the current day. However, variation at the DNA level is not so evenly distributed across the bread wheat genome. To some extent, this is due to the order, age and nature of the polyploidisation events that occurred during its speciation. The bread wheat genome is hexaploid (2nā€‰=ā€‰6xā€‰=ā€‰42), which means it consists of three subgenomes that have merged via inter-species hybridisation events during its evolutionary history (reviewed by Levy and Feldman 2022; Fig.Ā 15.1). Notably, the most recent event was a spontaneous hybridisation around 9000Ā years ago between the tetraploid progenitor of pasta wheat (the AA and BB subgenome donor) and a diploid wild wheat relative that grew alongside it called ā€˜goat grassā€™ (Aegilops tauschii Coss., DD subgenome donor) to create hexaploid bread wheat (AABBDD). Due to this event being rare, recent, and having occurred in a restricted Ae. tauschii sub-population close to the Caspian Sea (Wang et al. 2013), little D subgenome variation was captured, and there has been comparatively little time for genetic variation to subsequently accumulate via spontaneous mutation. The effect of this is evident in genetic analyses of bread wheat varieties from across the world (e.g. Wang et al. 2014; Walkowiak et al. 2020; Mellers et al. 2020), where D subgenome variation within genes is typically one-third to one-tenth of that seen on the A and B subgenomes. Consistent throughout the wheat subgenomes however is that gene density and gene variation are lower across the centromeric and adjacent pericentromeric chromosomal regions than in the remaining more distal chromosomal positions (IWGSC 2018). These centromeric and pericentromeric regions are associated with higher frequency of transposable elements (IWGSC 2018), higher levels of epigenetic modifications to DNA and histones associated with heterochromatin (tightly packed DNA), and lower genetic recombination (Gardner et al. 2016; Gardiner et al. 2019), which together are thought to result in the restricted rates of genome evolution observed in these regions (Akhunov et al. 2003). Against this genomic backdrop, in theā€‰~ā€‰9000Ā years since the speciation of bread wheat has been accumulating natural mutations which have either been retained or lost along the way due to a combination of selection, drift and geneflow. Such shifts in variation have underpinned the many generations of ā€˜on-farmā€™ selection that occurred from Neolithic times up until the advent of industrial breeding approaches at the end of the nineteenth century. Accordingly, wheat genetic variation was modulated across this time period by the interplay between human selection, be it conscious (such as selection for larger grains) or unconscious (such as selection for photoperiod insensitive lines; Jones and Lister 2022), and environmental factors such as prevailing climate and disease pressures. This ongoing domestication process resulted in the numerous locally adapted ā€˜landracesā€™ that were grown across the worldā€™s wheat growing regions up until the end of the 1800s. Early breeders exploited these sources of genetic diversity by systematically selecting and evaluating such landraces, as well as the crosses made between them. The outcomes of this history are still evident in modern wheat varieties, as these first breeding programmes commonly exploited the landraces that were locally adapted to their regions at the time. Evidence of this history can be seen in modern day wheat. For example, genetic marker analysis of wheat from around the world shows clustering of Chinese landraces and cultivars in genetic diversity space (Cavanagh et al. 2013), while in an analysis of 180 UK varieties released since the year 2000, almost 90% include genetic contributions from the old Ukrainian landrace OSTKA-GALICYJSKA and the Mediterranean landrace from which the early UK variety SQUAREHEAD was developed (Fradgley et al. 2019). Over the years, there have been concerns that the industrial breeding era has resulted in genetic bottlenecks in numerous crops, and that this has restricted genetic diversity in modern wheat. While there are many approaches to measure genetic diversity loss (reviewed by Khoury et al. 2021), for wheat it is clear that more genetic diversity was present in the landraces versus pure-line bred cultivars (e.g. Winfield et al. 2018). The assumption of loss of diversity when within the modern breeding period is not necessarily so apparent, with changes in diversity depending on multiple factors, including the time period and region studied. One factor that has been noted is a reduction in genetic diversity at and soon after the introduction of the ā€˜Green Revolutionā€™ semi-dwarfing genes across all international breeding programmes from the 1960s onwards (see Chap. 11). However, recent studies of on-farm wheat diversity indicate that at a national level, growers may now actually deploy a much more diverse portfolio of cultivars than was used 100Ā years ago. For example, in the USA the number of major commercially grown wheat cultivars has increased progressively, increasing fivefold from 1919 (33 cultivars) to 2019 (186 cultivars) with pedigree-based diversity measures of 1353 commercial USA varieties grown across this period indicating this increase in cultivar diversity is likely linked to increased genetic diversity (Chai et al. 2022). In the UK, combining measures of relatedness based on shared parentage (kinship), weighted by the proportional yearly acreage of cultivars over the last 30Ā years, found an increasing trend in the resulting landscape diversity index (Fradgley 2022). While the dominance of a very low number of varieties across national cropping landscapes is not as common as it once was (such as the use of cv. CAPPELLE-DESPREZ across more than 50% of the UK cropping area in the 1960s; Srinivasan et al. 2003), this is not necessarily the case throughout the wheat growing regions of the world. For example, between 2005 and 2010 the cultivar WYALKATCHEM represented more than 30% of the Australian wheat area sown, while more recently cv. MACE represented over 65% of the wheat cropping area in both 2015 and 2016 (Phan et al. 2020). Notably, these recent examples of low Australian landscape scale cultivar diversity are set against a wider background of a reduction in Australian wheat genetic diversity post Green Revolution (Joukhadar et al. 2017) and highlight the potential vulnerability of such landscape scale cultivar predominance to changes in pest and environmental pressures.

Fig. 15.1
figure 1

Evolutionary history of hexaploid bread wheat (Triticum aestivum) from its diploid and tetraploid donors progenitors. The unknown or extinct wheat B subgenome donor is a derivative of the S-genome species of the section Sitopsis, which includes diploid Ae. speltoides (diploid SS genome), Ae. bicornis (SbSb), Ae. longissima (SlSl), Ae. searsii (SsSs) and Ae. sharonensis (SshSsh). MYAā€‰=ā€‰millions of years ago

15.2.1 Systematic Broadening of the Wheat Genepool as Wild Wheats Are Deployed

A longstanding concern is that breeding results in loss of genetic diversityā€”however, as noted above this assumption is not a given. A good example in cereals is the maize long-term selection experiment, where continuous genetic gains within a closed population in response to selection for seed protein and oil content were observed across the 100-year programme, with no significant loss in genetic diversity (Dudley 2007). Presumably, this was achieved via continued selection for genetic loci of small additive effect, as well as the fixing of epistatic interactions (i.e. instances where the allele of one gene hides or masks the phenotype of another gene) as additive effects. It is thus feasible to optimise existing variation present in wheat cultivars into new combinations, and to bring in additional genetic and functional diversity from systematic introgression and analysis of chromosomal regions originating from landraces and species related to wheat. When present in otherwise elite wheat genetic backgrounds, the chromosomal segments present in such ā€˜wilder wheatsā€™ can often provide agronomic performance gains, despite the possible negative impacts of such chromosomal tracts (due, for example, to linkage drag or local effects on genetic recombination). Reminiscent of the activities at the start of the industrial breeding age, initiatives across the world are once again systematically screening variation captured in wheat landraces and are now supported by modern genetics, genomics, experimental population designs and analysis approaches. For example, the Watkins bread wheat landrace collection of 826 accessions from 32 countries has been genotyped using 41 microsatellite markers (Wingen et al. 2014), and selected accessions from across the genetic diversity space crossed to an elite spring cultivar to create a series of bi-parental genetic mapping populations (Wingen et al. 2017), termed a nested association mapping (NAM) panel. The benefits afforded by ā€˜wilder wheatsā€™ created via introgressions from wheat relatives are illustrated by the UK cultivar ROBIGUS. Released in the UK in 2020, ROBIGUS delivered high yields and contained particularly novel genetics derived from a wheat wild relative (Gardner et al. 2016) and has been frequently used in the pedigrees of subsequent UK varieties (Fradgley et al. 2019)ā€”without associated loss of wheat cultivar genetic diversity at landscape scale (Fradgley 2022). Genomic analyses now show that the presence of introgressions from wheat relatives is relatively common (e.g. Cheng et al. 2019; Keilwagen et al. 2022; Pont et al. 2019; Przewieslik-Allen et al. 2021; Scott et al. 2020a). Indeed, introgressions often underlie genomic regions conferring agronomically important traitsā€”particularly disease resistance (Aktar-Uz-Zaman et al. 2017). For example, resistance to the wheat fungal disease yellow rust conferred by Yr34 originated from a region of chromosome 5A introgressed over 200Ā years ago from einkorn wheat (T. monococcum L. ssp. monococcum; Chen et al. 2021), and still confers field resistance in the US (Chen et al. 2021) and UK (Bouvet et al. 2022b). The long breeding history of use and utility of introgression from wheat relatives is exemplified by the extensive use since the late 1980s of synthetic hexaploid wheats in the international wheat breeding programme run by the International Maize and Wheat Improvement Center (CIMMYT) (Das et al. 2016; see also Chap. 11). Synthetic hexaploid wheats address the lack of genetic diversity on the wheat D subgenome by recreating the ancient hybridization event between tetraploid wheat and Ae. tauschii. This is undertaken via inter-specific crosses followed either by embryo rescue, chromosome doubling (Li et al. 2018a, b) or use of specific cytogenetic stocks (Othmeni et al. 2022). While more than 1200 synthetic wheats have been generated by CIMMYT, historically these have sampled a relatively narrow range of Ae. tauschii diversity from the eastern Fertile Crescent. Systematic broadening of the diversity sampled in synthetic wheats is now being undertaken at pre-breeding initiatives at NIAB in the UK, where D subgenome Ae. tauschii genetic diversity from across its natural eco-geographic range is being captured in new synthetics and backcrossed into elite cultivars (Gaurav et al. 2022). While this and other initiatives (e.g. Zhou et al. 2021) are providing new sources of D subgenome genetic variation for breeding, similar approaches are systematically bringing in additional diversity from wheat A and B subgenome donors via the creation of inter-specific hybrids and subsequent backcrossing. For example, the generation of backcross-derived progenies from crosses between 59 diverse accessions of tetraploid T. turgidum ssp. durum with elite spring wheat cv. PARAGON (see also Chap. 8). Introgressions into elite wheat varieties from more distantly related diploid and polyploid grass species are also being generated, including Ambylopyrum muticum (TT genome, Coombes et al. 2022) and Thinopyrum species (Li and Wang 2009; Grewal et al. 2018; Li et al. 2018a, b; Cseh et al. 2019; Baker et al. 2020). The utility of genetic loci originating from the tertiary wheat genepool has begun to lead in the identification of the underlying genes and genetic variants; for example, the wheat Fhb7 locus conferring resistance to the fungal disease Fusarium head blight, and which originated from a Th. elongatum introgression, has been shown to encode an amino acid transferase that detoxifies toxins produced by the infecting fungus (Wang et al. 2020).

15.3 Current Genome-Wide Genotyping Approaches for Wheat

The history of speciation, domestication and breeding outlined above has shaped the heritable variation present across the wheat genome. At the DNA level, this variation includes changes to singe nucleotides (single nucleotide polymorphisms, SNPs), or via other rearrangements that typically involve DNA double strand break repair such as DNA insertions or deletions (InDels), gene copy number variation (CNV) and larger chromosomal rearrangements such as translocation and/or inversion of larger tracts of DNA. In the 2000s, advances in wheat research such as the sequencing across multiple tissues, developmental stages and cultivars of complimentary DNA (cDNA) transcribed from messenger RNA (mRNA), and subsequently the availability of genome assemblies for cv. CHINESE SPRING (the wheat reference genome; IWGSC 2018) and 15 additional wheat cvs. (Walkowiak et al. 2020; Chap. 14) (Table 15.1) have led to detailed catalogues of both genic and non-genic DNA variation. Due to their abundance and nature, wheat studies over the last 10Ā years have most commonly assayed genic single nucleotide polymorphisms (SNPs) for use in genetic mapping approaches. Since the publication of the first high-density wheat genotyping array in 2013 capable of assayingā€‰~ā€‰9000 SNPs (Cavanagh et al. 2013), several additional arrays ranging from 3000 to 850,000 features are now available (Table 15.2). While SNP genotyping arrays are relatively simple and cheap to use, one drawback is that only those variants that have been pre-selected to be present on the array can be assayed. Thus, if the SNP identification panel used to design the array does not contain adequate sampling of the variants in the target genepool, useful information on the variation present in a target set of germplasm cannot be adequately assessed. This is a common issue for example in synthetic hexaploid wheat and its derived germplasm, where much of the novel D subgenome variation captured in this germplasm may not be assayed. More recently, reductions in costs have meant that sequencing-based genotyping approaches have become increasingly used in wheat. These include complexity reduction approaches such as genotyping by sequencing (GbyS) (Poland et al. 2012), Diversity Array Technology sequencing (DArTseqā„¢; Sansaloni et al. 2011) and exome and/or promotor capture followed by Illumina short-read (i.e.ā€‰~ā€‰150Ā bp) sequencing (Table 15.2). More recently, whole genome low-coverage sequencing is beginning to be used for genotyping in wheat (Table 15.2) and is considered in more detail in Box 1. Natural variation in the form of InDels and CNV are also relatively abundant in the wheat genome (e.g. Pont et al. 2019; Walkowiak et al. 2020; Wang et al. 2022), and despite the relatively limited number of functionally characterised wheat genes to date (Chap. 9), such variation has been shown to be a relatively common source of functional variation. For example, just within the flowering time pathway, deletions across putative cis-regulatory sites caused by double-stranded DNA break repair via nonhomologous recombination have been shown to result in at least seven functional alleles of the VERNALIZATION1 (VRN-1) flowering time gene homoeologues in hexaploid and diploid wheat (Cockram et al. 2007), while CNV at the PHOTOPERIOD-1 (PPD-1) homoeologues determine flowering time in tetraploid and hexaploid wheat (DĆ­az et al. 2012; WĆ¼rschum et al. 2019; see also Chap. 11).

Table 15.1 Bread wheat cultivars/lines with genome assemblies

PPD-1) homoeologues determine flowering time in tetraploid and hexaploid wheat (DĆ­az et al. 2012; WĆ¼rschum et al. 2019; see also Chap. 11).

Table 15.2 Examples of recent high-density, high-throughput wheat genotyping approaches

Box 1: Wheat genotyping via skim sequencing

As genotyping via genome skim sequencing is typically undertaken at significantly less than 1-times genome-wide sequence coverage per line assayed (termed 1ā€‰Ć—ā€‰), multiple reads at any given chromosomal location are not expected for any single line. Therefore, this approach is suited for experimental populations with defined founders, such that confidence in the DNA variants identified from skim sequence in any one line is achieved via reads obtained from additional lines in the population that carry the same variant. For example, if there are 200 lines in a bi-parental population, with each line sequenced to 0.3ā€‰Ć—ā€‰coverage, we would expect on average 60ā€‰Ć—ā€‰coverage of any single locus, and therefore 30ā€‰Ć—ā€‰coverage of each allele at any bi-allelic locus, i.e. (200ā€‰Ć—ā€‰0.3)/2. Thus, by cataloguing and the SNPs present at good coverage in the population as a whole, the presence of any of these SNPs identified via a single sequencing read in any given line can be called with good confidence. Pre-determining the sequence variants present in the population founders, for example by exome capture or whole genome assembly, may help the process of variant calling and the imputation of variants that are not directly sequenced in any given line. For example, Scott et al. (2020a) sequenced the 16 founders of a wheat multifounder population via exomeā€‰+ā€‰promotor capture identifying 1.13 million SNPs across the 110,790 genes targeted by the capture probes. They then skim sequenced the 501 derived recombinant inbred lines (RILs) at 0.3ā€‰Ć—ā€‰coverage, which directly identifiedā€‰~ā€‰28% of these SNPs (i.e. 1.13 million SNPsā€‰Ć—ā€‰0.3ā€‰=ā€‰339,000 SNPs). SNP imputation in the RILs was then undertaken using the software STICH (Davies et al. 2016), resulting in 94% of the 1.13 million founder SNPs to be called and founder haplotype dosage at each chromosomal location to be assigned for all RILs. Down-sampling the 0.3ā€‰Ć—ā€‰read coverage showed RILs could be accurately inferred from sequence coverage as low as 0.076ā€‰Ć—ā€‰per RIL. Notably, at sequence coverage of 0.076ā€‰Ć—ā€‰and above, imputation accuracy was not dependent on whether or not founder haplotypes were included as a reference panel. This means that accurate RIL haplotype mosaics in the RILs could be achieved without the need to generate data on the 16 founders. In summary, imputation from low-coverage whole genome sequencing of experimental populations represents a relatively straightforward and cost-effective genotyping strategy for bi-parental and multifounder experimental wheat populations and does not suffer from the inherent bias of SNP array genotyping approaches that require the variants targeted to be pre-identified.

15.4 Genetic Mapping Resolution: Population Size, Genetic Recombination and Effect Size

Forward genetic mapping relies largely on the recombination fraction between a QTL and the genetic markers that have been genotyped in the population, and the heritability of the target trait. These considerations are reviewed in more detail elsewhere (e.g. Cockram and Mackay 2018), but in general greater genetic mapping resolution can be attained by increasing population size and/or undertaking additional rounds of crossing. Larger populations also have the benefit of providing greater QTL detection power. Important to consider is the heritability of the target trait and the effect size of the QTL detected. The more heritable a trait is, and the larger its effect size, the easier it is to detect and precisely locate. Indeed, most wheat QTL resolved to the underlying gene level are for highly penetrant major genes, such as gene-for-gene disease resistance loci (e.g. for a recent list of cloned wheat rust resistance genes, see Bouvet et al. 2022b), awn presence/absence (Huang et al. 2020), vernalization response (first undertaken in T. monococcum: Yan et al. 2003; Yan et al. 2004), plant height (Tian et al. 2022) and grain quality (Uauy et al. 2006). If trait heritability is low, phenotypic replication can increase line mean heritability and has been used to refine and update the genetic interval of a locus on chromosome 5A controllingā€‰~ā€‰10% variation for wheat grain size (Brinton 2017; Brinton et al. 2017). Aside from such highly penetrant genetic loci, the genetic architecture of most target traits in wheat is highly quantitative in nature. For example, the mean QTL effect size for grain size traits in wheat is less than 10%, compared to more than 20% in the diploid cereal rice, and is likely due to the buffering effect of homoeologues of overlapping function in hexaploid wheat (Brinton and Uauy 2019).

15.5 Population Types

The identification of functional gene variants via genetic mapping relies on the capture of sufficient genetic diversity and genetic recombination. Fundamentally, two broad experimental population types are employed by researchers interested in identifying genetic loci controlling traits of interest. Both exploit genetic variation, and the reshuffling of this variation via genetic recombination, in order to associate markers or groups of markers (haplotypes, see also Chap. 9) with target traits.

15.5.1 Experimental Populations

Experimental populations are derived from crossing two or more parents to produce progeny in which genetic loci can be identified by the strength of the associations between genetic markers and traits of interest. Examples of some commonly used experimental populations are listed below and are illustrated in Fig.Ā 15.2.

Fig. 15.2
figure 2

Illustration of experimental population and association mapping panel designs. Number of founders illustrated in each panel is indicated in brackets. Dashed lines indicate inbreeding (via single seed descent or doubled haploid approaches) to produce multiple inbred lines. AICā€‰=ā€‰advanced intercross, two rounds of intercrossing illustrated, prior to the production of inbred lines. NAMā€‰=ā€‰nested association mapping. NCIIā€‰=ā€‰North Carolina II model. MAGICā€‰=ā€‰multifounder advanced generation intercross. AMā€‰=ā€‰association mapping population

15.5.1.1 Bi-parental

Bi-parental populations are most commonly used in wheat forward genetics research and are constructed by first crossing two parents to generate first filial (F1) derived progeny lines. Inbred progeny are generated either by single seed descent (whereby individual F2 lines are selfed over three or more generations to achieve acceptable levels of homozygosity genome-wide) or via doubled haploid approaches (where haploid F1-derived gametes undergo chromosome doubling, resulting in completely inbred progeny in a single generation) (Fig.Ā 15.2). Despite DH lines typically taking less time to create compared to RILs, DH populations capture less genetic recombination. This is because additional genetic recombination events can occur between regions of heterozygosity from the F2 generation (25% heterozygous) until effective fixing at around the F6 stage (1.6% heterozygous) or beyond, and which on average is equivalent to one additional round of crossing. Bi-parental populations are now beginning to be constructed from wheat cultivars with genome assemblies, such as the CHINESE SPRINGĀ Ć—Ā PARAGON population (Wingen et al. 2017).

15.5.1.2 Advanced Intercross

Even when bi-parental populations are created via single seed descent, the amount of genetic recombination captured can be relatively low. One way to increase the number of genetic recombinations is to continue random intercrossing of the F2 for one or more generations before the production of inbred lines (Fig.Ā 15.2). Such advanced intercross (AIC) populations (Darvasi and Soller 1995) designs provide greater precision compared to standard bi-parental populations of the same size. For example, Darvasi and Soller (1995) estimated that eight rounds of random intermating would reduce a QTL interval from 20 to 3.7Ā cM. While AIC have been used in species such as Arabidopsis (Fitz et al. 2014) and maize (Balint-Kurti et al. 2008), they have yet to be implemented in wheatā€”likely due to the time required to undertake additional rounds of crossing. However, the advent of ā€˜speed breedingā€™ approaches, that allow the generation time of both spring (Watson et al. 2018) and winter (Cha et al. 2022) wheat varieties to be reduced, means that for primary QTL screens, AIC approaches in wheat should become a more attractive prospect.

15.5.1.3 Near Isogenic Line Pairs, Introgression Lines and Chromosome Segment Substitution Lines

A near isogenic line (NIL) captures a relatively small chromosomal region from one ā€˜donorā€™ parent within the wider genomic context of a second ā€˜recipientā€™ parent (Fig.Ā 15.2). NILs are generated via repeated rounds of backcrossing, often with the use of genetic markers to select for donor at the target chromosomal region, and for the recipient across the remainder of the genome. NILs are commonly used to target specific QTL of interest, allowing the effect of the contrasting alleles captured in the NIL pair to be evaluated using a single pair of lines, rather than a larger population in which additional genetic loci affecting the target trait may be segregating. Following this approach, individual genetic loci controlling a target trait can be investigated in detail, and the underlying physiology and pleiotropic effects on related traits can be assessed. Further, a NIL pair can be crossed to generate further genetic recombination and so further refine the genetic interval. For example, contrasting alleles at a major effect genetic locus for wheat grain weight identified in a bi-parental population of 192 inbred lines was subsequently assessed via phenotypic evaluation of BC2- and BC4-derived NILs, finding theā€‰~ā€‰7% increase in grain weight was (i) mediated predominantly by increased grain length, (ii) the maternal pericarp cell length was longer in the NIL carrying the high grain weight allele, and that (ii) increased grain length was detectable 12Ā days after fertilisation (Brinton et al. 2017). Additionally, the NILs were used to further refine the genetic interval to 4.3Ā cM (Brinton et al. 2017), with further analysis indicating that two genetic loci may be present at the locus (Brinton 2017). A series of NILs that capture chromosomal segments from wild and domesticated wheat relatives is termed introgression lines. Recent work in the UK has generated such germplasm resources for a range of wheat relatives. These include diploid Ae. caudata (CC genome. Grewal et al. 2020), Am. muticum (TT. Coombes et al. 2022), Th. bessarabicum (JJ. Grewal et al. 2018), and T. uratu (AuAu. Grewal et al. 2021), tetraploid T. timopheevii (AtAtGG. Devi et al. 2019), hexaploid Th. intermedium (JJ JvsJvs StSt. Cseh et al. 2019) and decaploid Th. elongatum (EbEb EbEb EbEb EStESt EStESt. Baker et al. 2020), with all introgression lines generated using the recipient wheat cv. PARAGON. When a series of NILs is designed to collectively capture the entire donor background, the resulting resource is termed a chromosome segment substitution line (CSSL) population. In wheat, CSSLs populations that capture novel A, B and D subgenome diversity from wheat relatives have recently been developed using (i) a synthetic hexaploid wheat line (Horsnell et al. 2022) and (ii) a tetraploid T. turgidum ssp. dicoccoides accession (TTD-140). Not only do CSSL populations serve as useful sources of novel variation, they can also be used directly for genetic mapping, as recently illustrated in wheat by Horsnell et al. (2022).

15.5.1.4 Multifounder Populations: NAM

While bi-parental populations and derived NILs had long been the mainstay of forward genetic approaches, multifounder populations have recently become commonplace in plant research (reviewed by Scott et al. 2020b). Multi-parent mapping populations capture more variation than bi-parental populations and increase precision via joint linkage and association analysis. Nested association mapping (NAM) populations represent a series of bi-parental populations (termed ā€˜familiesā€™), each of which has the same parent in common (Fig.Ā 15.2). The first NAM population was made in maize (Zea mays L.) by crossing 25 diverse inbred lines with the inbred line B73 (termed here the ā€˜linkingā€™ founder)ā€”one of the most widely used lines in the history of maize breeding, and the line used for the maize reference genome (Yu et al 2008). Since then, the maize NAM parents have become extensively characterised, including provision of their genome assemblies (Gage et al. 2020). The genetic resolution obtained from NAM populations largely depends on the number of alleles present in the founders and the amount of genetic recombination captured in the progeny. The rarest alleles in any NAM population will be present in half of the progeny from the corresponding family. Therefore, in a NAM with 25 families and 200 progeny per family, rare alleles are expected to be present in 100 of the total 5000 progeny lines, i.e. a frequency of 2%. For NAM design, increasing the number of founders at the expense of family size should be preferable, as the decay of parental linkage disequilibrium for a given allele would likely, on average, be shared among more parents (Gage et al. 2020). NAM populations have now been made in many crop species and can be genetically analysed using association mapping approaches. At least part of the attraction of NAM design is that their composition (a series of bi-parental populations with a common parent) makes them more conceptually familiar to researchers experienced with bi-parental populations. Indeed, once a genetic locus has been identified in a NAM, it is straightforward to continue further analysis using one or more of the relevant constituent bi-parental populations. To date, several NAM populations have been created in wheat (Table 15.3; Fig.Ā 15.3). The founders used include elite cultivars (e.g. Bajgain et al. 2016), genetically diverse landraces (Wingen et al. 2017), as well as germplasm that captures backcrossed chromosomal segments from wheat relatives via synthetic hexaploid wheat and wheat vs tetraploid durum wheat (T. durum ssp. durum) introgression lines. Further, a recent durum NAM has been constructed by crossing 50 durum landraces to an Ethiopian durum cultivar (Kidane et al. 2019). The largest wheat NAM currently available was constructed using 60 inbred worldwide landraces from the Watkins wheat landrace collection, backcrossed to the spring UK cultivar PARAGON, generating a population of 1192 RILs and a mean of 105 RILs per family (Wingen et al. 2017). Therefore, the rarest allele captured in the Watkins NAM would be expected to be present in 4% of the populationā€”a frequency nominally sufficient for detection via genetic analysis.

Table 15.3 Examples of wheat multifounder populations and association mapping panels
Fig. 15.3
figure 3

Features of existing wheat nested association mapping (NAM) populations, comparing mean NAM progeny per component bi-parental population (x-axis) with the number of NAM founders (y-axis) and the size of the resulting population (proportional to the size of the circle)

15.5.1.5 Multifounder Populations: North Carolina II Model

A notable limitation of NAM populations is that while multiple founders are employed, a single ā€˜linkingā€™ parent is used with which to cross to. The North Carolina II (NCII) design of Comstock and Robinson (1952) is conceptually an extension of NAM, whereby two or more ā€˜linkingā€™ parents are used such that every progeny family has half-sib relationships both through a common mother and through a common father (Fig.Ā 15.2). Similarly, any combination of populations with founder links between them can be analysed together to undertake genetic analysis and to increase power and precision by increasing sample size (Cockram and Mackay 2018). However, such populations are more commonly used to confer detection of QTL in different genetic backgrounds and on the analysis of epistasis.

15.5.1.6 Multifounder Populations: MAGIC

While NAM and NCII populations capture more diversity than bi-parentals, they capture no additional genetic recombination than bi-parental populations of the same size. Since its pioneering use in mouse in 2002 (The Complex Trait Consortium 2002), the multi-parent advanced generation intercross (MAGIC) design has been applied to many crop species (Scott et al. 2020b). To aid crossing design, MAGIC populations typically use 4, 8 or 16 founders. However, unlike NAM or NCII populations, all MAGIC founders are intercrossed over multiple rounds of crossing to produce progeny that capture equal proportions of each founder genome (Fig.Ā 15.2). Thus, MAGIC combines the benefits of increased genetic diversity afforded by NAM and NCII, with increased amounts of genetic recombination afforded by AIC, while minimising population structure via controlled crossing. In contrast to bi-parental populations, which are typically constructed to target a single target trait and are relatively quick to generate, MAGIC populations aim to capture and recombined multiple alleles across the genome and therefore take much longer to create. However, once complete, MAGIC, as well as other multi-parent populations, are well suited as community resources. In wheat, six MAGIC populations have been published, the first of which was the Australian spring wheat 4-parent MAGIC (Huang et al. 2012). Since then, four additional MAGIC populations have been created: 8-parent populations from Australia (Shah et al. 2019), the UK (Mackay et al. 2014) and Germany (Sannemann et al. 2018; Stadlmeier et al. 2018), as well as a 16-parent European wheat MAGIC (Scott et al. 2020a) (Fig.Ā 15.4). Additionally, a MAGIC-like wheat population made between one male-sterile line crossed and backcrossed with 59 European/worldwide lines, followed by 12 generations of random intermating, has been generated (ThĆ©pot et al. 2014). To date, the 8-founder NIAB Elite MAGIC population likely has the most publicly available resources available, including the population and associated 90Ā k array SNP data (Mackay et al. 2014) and genetic map (Gardner et al. 2016), genome assemblies for two of the founders (Walkowiak et al. 2020), and phenotypic and genetic data for numerous traits including disease (Bouvet et al. 2022a, c; Corsi et al. 2020; Lin et al. 2020a; Riaz et al. 2020) flowering time (Wittern et al. 2022), canopy architecture (Zanella et al. 2022), ear architecture (Dixon et al. 2018), end-use quality and mineral content (Fradgley et al. 2022a). Additionally, BLAST access to the genome assembles for the remaining six founders is currently available ahead of publication (https://www.cropdiversity.ac.uk/8magic-blast/) and release of whole genome skim sequencing data for the RILs is imminent (J Cockram personal communication).

Fig. 15.4
figure 4

Crossing diagram illustrating the founder selection and pedigree of the wheat 16-parent ā€˜NIAB Diverse MAGICā€™ population. The red and blue lines each track the pedigree of a single recombinant inbred line (RIL) through the pedigree

15.5.2 Founder Selection

Founder choice in any structured population is one of the first decisions addressed and depends to some degree on population type. For a bi-parental population, founders that contrast for a specific trait of interest are typically selected. In some cases, selection criteria will also include selection for specific traits that may otherwise confound the target phenotype. For example, founders with similar ear emergence date may be selected to avoid pleiotropic effects on diseases such as Fusarium head blight that affect the wheat ear. However, the differential presence of alleles of contrasting effect between founders may mean that while the parents may have been selected for similar phenotype, segregation for the phenotype may still be observed in the progeny. For NAM and MAGIC populations, founders should generally be selected to maximise genetic diversity, particularly in those designs that include larger founder numbers. For NAM populations, the selection of the ā€˜linkingā€™ founder is notably important as each progeny line will sample 50% of its genome, and its genome will be highly represented in the population. ā€˜Linkingā€™ founders typically represent a line which has been particularly well characterised, or is common in the wheat pedigree within the target geographical region. For example, the cultivar PARAGON has been selected as the ā€˜linkingā€™ founder in three wheat NAM populations: Watkins-60 (Wingen et al. 2017), NIAB SHW and NIAB TetHex (data repository at https://niab.github.io/niab-dfw-wp3/). PARAGON is a spring UK variety released in 1988 which has a sequenced genome (Walkowiak et al. 2020), RNA sequence (RNA-seq) data from multiple tissues and a gamma-irradiated series of deletion lines (available via https://www.jic.ac.uk/research-impact/germplasm-resource-unit/). Similar considerations apply to the selection of linking founders in NCII population designs, although as two or more such founders are used, more flexibility is afforded.

If the aim of the population is to generate data under field conditions, founders should be suited for growth in the environments under which they will be phenotyped. When constructing populations using elite varieties, this should be relatively straightforward. For example, in the NIAB Diverse MAGIC population, the 16 founders were selected to sample maximum genetic diversity across a wider collection of 94 European winter wheat cultivars released over a 70Ā year period, for assessment under UK field conditions (Scott et al. 2020a). However, for populations that capture variation from landrace or species related to wheat, especially if these donors originate from geographic areas distant to the target environment, adaptability of the resulting populations to local field environments could be more problematic. In bi-parental or NAM populations, one way to address this is to generate populations from backcross-1 (BC1) generation (where each progeny line contains on average 25% of the non-recurrent founder genome) or beyond, rather than from the F1 which is expected to contain 50% contribution from each founder. This approach is logistically harder, as it involves an additional round of crossing and requires more progeny than an F1-derived population to effectively sample non-recurrent founder genome. However, if the aim is to generate phenotypic data under field conditions, such approaches may be beneficial. For MAGIC designs, as each progeny line represents a balanced genomic mosaic of all founders, the inclusion of one, or possibly more, ā€˜wilderā€™ founder genomes is slightly less problematic. For example, in an 8-founder MAGIC which includes one ā€˜wilderā€™ founder, each RIL would be expected to contain a 1/8th genomic contribution from the ā€˜wilderā€™ founder. While no such MAGIC populations have been constructed to date in wheat, the most diverse is the INRA MAGIC-like population developed using one male-sterile line (cv. PROBUS) crossed and backcrossed with 59 European/worldwide lines before 12 generation of random intermating to generate 1000 lines (ThĆ©pot et al. 2014). Finally, for all population designs, it may be useful to consider the size and extent of any genomic rearrangements (e.g. the chromosome 5AL/7AL translocation Walkowiak et al. 2020) or chromosomal introgressions from wheat relatives, as their presence is likely to disrupt local genetic recombination rates. While such regions may specifically be sought, for example the Ae. tauschii (D) and T. durum ssp. durum (AB) genomic contributions captured in the NIAB SHW NAM, it is possible that one or more founders are unintentionally selected that contain such features. For example, in the 16-founder NIAB Diverse MAGIC population, cv. MARIS FUNDIN carries a large introgression of 540Ā Mb from T. timopheevi on chromosome 2B which is substantially over-represented in the MAGIC progeny (Fig.Ā 15.5) (Scott et al. 2020a). Segregation distortion due to introgressions was also identified in the 8-founder NIAB Elite MAGIC, for example due to the chromosome 1B/1R wheat/rye introgression in cvs. BROMPTON and RIALTO and the presence of an introgression on the long arm of chromosome 4A in cv. ROBIGUS (Gardner et al. 2016).

Fig. 15.5
figure 5

ā€‰~ā€‰540Ā Mb chromosome 2B introgression from T. timopheevi present in the NIAB Diverse MAGIC founder MARIS FUNDIN, as identified by analysis of exome-promotor capture sequence data of the 16 founders. The introgression is visualised here by the increase in non-reference (relative to chromosome 2B IWGSC RefSeq v1.0, cv. CHINESE SPRING) SNP variants (top) and as reduced sequence coverage (bottom) in MARIS FUNDIN, compared to the remaining 15 founders. Scott et al. (2020a) find the introgression to be substantially over-represented in the MAGIC progeny

15.5.3 Association Mapping Panels

The experimental populations described above take time to construct. However, it is possible to exploit the genetic variation and historical genetic recombination captured in existing collections of wheat varieties, landraces or accessions (Fig.Ā 15.2). Such association mapping approaches aim to locate QTL based on the strength of the association between genetic markers and the target trait(s) and rely on the decay of linkage disequilibrium between markers and QTL over genetic distance (Cockram and Mackay 2018). Genetic analysis of association mapping panels can be conducted using markers from candidate genes, or from across the genome using a whole genome association scan (GWAS) approach. Most commonly, single markers are regressed against the target trait. However, power can be increased by constructing haplotypes from the genotypic allele calls of two or more genetic variants that are closely physically or genetically linked within a defined region (haploblock). Use of haplotypes in GWAS can improve the estimation of allelic effects and increase statistical significance and is increasingly used in wheat. For example, linkage disequilibrium approach to defining haploblocks in a panel of 6333 wheat lines genotyped with 14,027Ā GbyS genetic markers resulted in the identification of 537 genome-wide haploblocks for downstream GWAS of grain yield (Sehgal et al. 2020). Alleles present at a frequency of less than 5% within the panel will typically not be detected, even if these alleles have relatively high effect sizes and/or the causative polymorphism is assayed. In human genetics, approaches that help identify rare alleles in GWAS are increasingly being used (reviewed by Lee et al. 2014), such as aggregation tests that evaluate cumulative effects of multiple genetic variants in a gene or region. The ability to generate experimental populations in plants means that such approaches are not as necessary to explore.

Unlike the case in most experimental populations in which allele frequency is relatively equally distributed among the progeny, association mapping panels are often characterised by notable levels of population substructure or subdivision. This is due to the differences in the shared ancestry of the lines over time, due to non-random mating. In cereal crops, population structure commonly arises from (i) physical separation, i.e. (geographic location), (ii) the contrasting germplasm preferences within different breeding companies, (iii) seasonal growth habit (i.e. spring or winter-sown) and (iv) traits underlying end-use quality (such as malting or feed in barley, or bread making versus in wheat) (Cockram et al. 2010; White et al. 2022) and yield (Sharma et al. 2022). For example, while relatively few major genetic determinants control the spring versus winter phenotype (Bentley et al. 2013), the common practice that spring cultivars are typically bred from other spring lines, while winters are bred from winters means that any genetic variants present at notably different frequencies between these two germplasm pools continue to show skews in their frequency in progeny lines. Thus here, if a favourable allele controlling a trait of interest happened to segregate predominantly in the spring pool, then the population structure inherent within spring varieties may lead to false-positive genotype-trait associations (termed Type-I errors) that are not due to close linkage of markers with the underlying QTL. It is possible to control statistically for population structure (Q) by using genetic markers to determine a Q-matrix of population membership estimates for each accession in the panel. Q-matrices can be determined using programmes such as STRUCTURE (Pritchard et al. 2000) or via principal component analysis (Zhao et al. 2007). Additional correction for more recent similarities due to close kinship (K) can also be included and can be determined using genetic markers. Indeed, approaches such as the Qā€‰+ā€‰K mixed model (Yu et al. 2006) that account for multiple levels of relatedness between individuals have been shown to control well for false-positive as well as false-negative (Type-II error) associations and often lead to higher power than correction via Q or K alone (Yu et al. 2006). However, accounting for population structure/kinship sacrifices some level of experimental power to detect those genetic loci that are correlated with the adjustments made. Nevertheless, power and precision to detect genetic loci in association mapping panels can be high, compared to experimental populations of the same size. While improved power can be achieved by increasing the number of individuals in the panel, the inclusion of additional accessions may increase population substructure and/or kinship. Similarly, linkage disequilibrium may decay quite slowly in with genetic distance in cultivars (due to close kinship among all lines), which will reduce the precision to detect QTL (Cockram and Mackay 2018) but will increase power. Conversely, linkage disequilibrium in panelā€™s landraces is typically higher, enabling greater genetic mapping precision. Genotyped wheat landraces collections are now available that sample diversity with single countries (e.g. China, Zhou et al. 2017) or from around the worldā€”such as the Watkins (Wingen et al. 2014) and Vavilov collections (Riaz et al. 2018). These are beginning to be used for GWAS of agronomic traits, such as disease resistance (tan spot, Halder et al. 2019; leaf rust. Riaz et al. 2018, stripe rust, Jambuthenne et al. 2022) and pre-harvest sprouting (Zhou et al. 2017). Given the multiple variables affecting GWAS in association mapping panels, it is useful to determine the efficacy of experimental design by undertaking power calculations, especially if population size is relatively small (e.g. White et al. 2022).

15.6 Reverse Genetics Germplasm Platforms

Functional validation of genes genetically mapped using experimental or association mapping populations can be undertaken using reverse genetics approaches. Transgenic methods aim to alter gene expression or function, typically via gene overexpression, gene silencing or gene editing (reviewed in wheat by Adamski et al. 2020). Alternatively, non-transgenic reverse transgenics approaches are available that exploit genetic variation induced by mutagenizing agents. In wheat, the most commonly used are Targeting Induced Local Lesions in Genomes (TILLING) populations, created by using an inbred donor line (termed the M0 generation) and applying the chemical agent ethyl methanesulphonate (EMS). The resulting EMS treated seed is termed the M1 generation, which can be subsequently selfed over several generations to generate a population of TILLING lines in which the EMS-generated mutations become progressively fixed in homozygous state. Bespoke experiment-specific TILLING populations are frequently used to determine genes underlying traits controlled by single major effect genes, such as gene-for-gene disease resistance. In such cases, a wheat line for which resistance to the target disease is controlled by a single major effect locus is mutated, and susceptible TILLING lines identified phenotypically. Assuming the underlying gene can be sequenced, relatively low numbers of TLLING lines with independent mutations at the target locus are generally sufficient to give a high statistical probability of identifying the causative gene. For example, SĆ”nchez-MartĆ­n et al. (2016) estimated that the probability that the 12Ā kb gene containing contig of their target wheat gene (Pm2 conferring resistance to powdery mildew) being mutated across all 12 identified powdery mildew susceptible TILLING mutants was 1 in 300,000,000,000. Several approaches to applying DNA sequencing to such gene identification approaches have been published: the first uses exome capture of pre-determined candidate gene families (termed resistance-gene enrichment sequencing, RenSeq, when applied to NRL disease resistance gene families; Jupe et al. 2013). The second, termed MutChromSeq, involves flow sorting and direct sequencing of the target chromosome in each of the phenotypically identified TILLING lines (SĆ”nchez-MartĆ­n et al. 2016). In addition to such experiment-specific TILLING resources, exome capture followed by DNA sequencing of large numbers of TILLING lines generated from the spring bread wheat cv. CADENZA (1200 lines) and the tetraploid wheat cv. KRONOS (1535 lines) have been made publicly available (Krasileva et al. 2016). The resulting TILLING mutations have been aligned against the bread wheat reference genome of cv. CHINESE SPRING (RefSeq v1.1; IWGSC 2018) and searchable via the Ensembl plants (Cunningham et al. 2022) genome browser. The effects of mutations on protein sequence have been predicted in relation to CHINESE SPRING gene models, with deleterious mutations determined to be present in around 90% of the captured genes. The ability to identify and prioritise TILLING mutants in silico means these resources serve as useful genome-wide resources for gene functional validation in wheat. Considerations for the identification and validation of wheat TILLING mutants in the CADENZA and KRONOS populations are listed in more detail by Adamski et al. (2020) and include the need to combine TILLING mutants in multiple homoeologues to overcome possible functional redundancy as well as the need to undertake sufficient rounds of backcrossing to remove background mutations. Examples of their use for gene functional characterisation include (i) wheat candidate genes orthologous to map-based cloned gene from model species (e.g. TaGRAIN WIDTH2, Simmonds et al. 2016), (ii) wheat genes identified via forward phenotypic screening followed by bulk segregant analysis of backcross derived progeny between mutant line and wild-type (e.g. HOMEOBOX DOMAIN-2, Dixon et al. 2022) and (iii) candidate genes underlying wheat genetic loci previously refined by fine-mapping (e.g. WHEAT ORTHOLOG OF APO1, Kuzay et al. 2022; EARLY FLOWERING 3, Wittern et al. 2022). While the ability to screen in silico the cv. CADENZA and KRONOS TILLING populations provide proven community resources for gene functional characterisation, they can only be used for those genes present in the two founding cultivars used. The availability of annotated genome assemblies for multiple wheat varieties now provides the underpinning knowledge from which it may in future be possible to develop additional sequenced TILLING resources that target genes not captured in cv. CADENZA and KRONOS.

15.7 The Future of Genetic Recombination

Genetic recombination in wheat is enriched in the telomeric regions and becomes progressively less frequent towards the pericentromeric and centromeric regions, with 80% of recombination events occurring in less than a quarter of the genome (e.g. Gardner et al. 2016; IWGSC 2018). As genetic mapping relies on the occurrence of recombination, being able to increase recombination at chromosomal regions of interest would help both genetic mapping precision, and the ability to recombine different haplotypes in breeding. Analysis of crossover events in RIL populations has identified QTL for genetic recombination frequency, such as a locus on chromosome 6A in the CHINESE SPRINGā€‰Ć—ā€‰PARAGON population controlling around 6% of the variation (Gardiner et al. 2019). Further, recent work shows that recombination events in wheat pericentric regions can be increased in some chromosomes by increasing temperature during meiosis (Coulton et al. 2020), although this does come with reduced fertility (Draeger and Moore 2017). Transgenic approaches for altering genetic recombination rates and locations are also now being investigated. For example, transient virus induced gene silencing (VIGS) of wheat candidate genes homologous to genes in other species shown to control genetic recombination shows it is possible to alter the distribution of recombination along chromosomes (Raz et al. 2021). VIGS silencing of the durum wheat homologue of the anti-cross over gene XRCC2 (a paralogue of RAD51) in F1 plants ahead of meiosis resulted in increased genetic recombination across much of the pericentromeric region of chromosome 4B, as well the more distal pericentromeric regions of chromosome 5B (Raz et al. 2021). Such results indicate that it should be possible to increase genetic recombination in at least some of the pericentromeric landscape of wheat. The maturation of gene editing methodologies may soon enable the targeting of cross-overs and genetic recombination to more specific genomic locations.

15.8 Conclusions

In parallel to the efforts to provide wheat genomic and genotyping tools, the wheat community has generated extensive resources to support genetic locus and gene characterisation via forward and reverse genetics approaches. For highly penetrant wheat genetic loci originating from natural variants or via induced mutation, and where phenotype effectively acts as a genetic marker, various routes have been used to identify the underlying genetic loci, including fine-mapping in bi-parental derived germplasm, as well as reverse genetics approaches such as RenSeq and MutChromSeq where the identification of multiple independent alleles rather that genetic recombination is required. For genetic loci of a more quantitative nature, to date it is those which account for an unusually high proportion of genetic variation that have been fine-mapped or map-based cloned, using bi-parental populations and also more recently via multifounder populations. The vast majority of remaining heritable variation in the wheat genepool is much more quantitative in nature, typically accounting for 3ā€“5% of the phenotypic variation. For such loci, including those located in genomic regions with very low genetic recombination, identification of the underlying genes and variants via forward mapping approaches will continue to pose a challenge. However, genetic mapping approaches will allow their alleles and linked haplotypes to be determined, and increasingly, for the epistatic non-additive interaction effects of these loci to be characterised. For wheat breeding, advances in our knowledge of genetic loci and gene function will best be exploited within a quantitative genetics framework (Mackay et al. 2021). Trait improvement in the context of breeding over the next decade will likely focus on integration of multi-trait ensemble phenotypic weighting approaches (e.g. Fradgley et al. 2022b) combined with improved genomic selection methodologies and field-based phenotyping at increasing throughput and precision. The next decade will likely also see the maturation of approaches to engineer increased genetic recombination, and to design via gene editing new alleles with improved function. Finally, computer vision, artificial intelligence and machine learning and approaches are now maturing to the point at which they can more readily be applied to complex challenges such as crop phenotyping and plant breeding. Such approaches need to be efficiently combined to underpin future breeding for improved crop performance, quality and resilience.