1 Learning Objectives

  • To provide background information on population improvement and selection methods in relation to wheat breeding.

  • To facilitate critical thinking around roles of population improvement and selection methods in the design of wheat breeding programs.

2 Population Improvement

Chapter 5 introduced various line development methods in the context of wheat breeding programs. While representing a key component of the breeding pipeline, line development alone will not lead to the production of successful varieties year after year. To achieve ongoing variety improvement, a population improvement strategy should be implemented in order to enhance the entire genetic base of the breeding program. Recurrent selection is the predominant approach to population improvement in wheat breeding, but evolutionary breeding has also received some attention as an alternative approach.

2.1 Evolutionary Breeding

The idea for evolutionary plant breeding [1] grew out of the desire to improve the efficiency of the bulk breeding method. The evolutionary breeding method involves the bulking of F1 progeny followed by many generations of prolonged natural selection (and incidental artificial selection) in successive natural environments. The concept is simple: let nature select the best adapted genotypes over time, thus minimizing the effort required for traditional selection and testing of individual genotypes.

The exploitation of natural selection represents a similarity between the bulk and evolutionary breeding methods, but there is an important distinction. Bulk breeding is considered a line development method because it is used to inbreed a population during a limited number of generations of self-fertilization following the initial cross until a desired level of homozygosity is reached. Conversely, evolutionary breeding represents a population improvement strategy because it relies on natural outcrossing and selection to generate new genetic combinations and lead to incremental improvement in the population over time. Natural selection acts on these heterogeneous mixtures over many generations to produce populations with superior environmental adaptation.

As with bulk breeding for line development, this reliance on natural selection provides a logistical advantage to the population improvement process, but a potential disadvantage is that nature may select for traits that are undesirable in agronomic settings. For example, when evolutionary breeding populations are replanted, a subset, typically 1/30th to 1/50th or less, of the seed is sampled for use in the next generation. The plants producing the most seeds have the greatest likelihood of representing the next generation, and finite resources available for reproduction may produce a negative correlation between seed size and seed number, so natural selection may inadvertently favor smaller seeds. Another example is undesired selection for increased lodging. Taller plants may shade out shorter plants, thereby producing more seeds, but tall plants may be prone to lodging.

Most empirical studies testing the utility of the evolutionary breeding method were performed in barley. Hockett et al. [2] evaluated agronomic traits in Composite Cross II (CCII), which was developed by crossing 28 barley lines and sowing the subsequent generations under natural selection. The resulting populations were found to have higher yield than a mixture of the original parents, suggesting that improvement due to natural selection had been made. However, various CCII populations which had been developed in different environments were not significantly different from one another, even when tested in the environment in which they were developed. This suggests that natural selection had not produced local environmental adaptation. Furthermore, contemporary varieties yielded significantly more than the best CCII populations, suggesting that greater genetic progress had been made from conventional breeding methods.

Heterozygosity is expected to be reduced by half in each successive generation of self-fertilization after an initial cross is made to produce a segregating population or after a natural outcrossing event. However, empirical studies have demonstrated that natural selection can sometimes preserve the level of heterozygosity in the population when there is an advantage for the heterozygote with respect to relative fitness. For example, Hockett et al. [2] observed a high level of variability remaining in the F19 generation of CCII, which the authors attributed to the adaptive advantage of the heterozygote. The promotion of outcrossing can further maintain or increase heterozygosity in the population.
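The expected halving of heterozygosity, and the way outcrossing slows it, can be illustrated with a short sketch. It assumes a single biallelic locus, no selection, a large population, and a constant outcrossing rate; the function name, the mixed-mating recurrence, and the 5% outcrossing rate are illustrative assumptions and are not taken from the cited studies.

```python
def heterozygosity_trajectory(h0, generations, outcrossing_rate=0.0, allele_freq=0.5):
    """Expected heterozygosity at one biallelic locus under mixed mating.

    Assumptions (not from the chapter): no selection, a large population,
    a constant outcrossing rate, and random mating among outcrossed plants.
    """
    hardy_weinberg = 2.0 * allele_freq * (1.0 - allele_freq)
    trajectory = [h0]
    h = h0
    for _ in range(generations):
        # Selfed offspring halve heterozygosity; outcrossed offspring are
        # heterozygous at the Hardy-Weinberg frequency 2pq.
        h = (1.0 - outcrossing_rate) * h / 2.0 + outcrossing_rate * hardy_weinberg
        trajectory.append(h)
    return trajectory


# Start from an F1 (fully heterozygous at loci where the parents differ).
selfing_only = heterozygosity_trajectory(1.0, 6)
# Hypothetical 5% natural outcrossing rate, chosen only for illustration.
with_outcrossing = heterozygosity_trajectory(1.0, 6, outcrossing_rate=0.05)
```

Under these assumptions, complete selfing leaves about 1.6% heterozygosity after six generations, while even a low outcrossing rate holds the population at a higher level. Heterozygote advantage, which the sketch ignores, would raise it further.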

An important disadvantage of evolutionary breeding as a population improvement approach is the amount of time required to observe a benefit. Following an initial cross or crosses, long-term progress in evolutionary breeding populations is dependent on natural outcrossing, which occurs at relatively low levels in wheat. This disadvantage can be partially offset by the ability to plant and maintain several populations simultaneously, though it is difficult to predict which populations will produce superior progeny, and this will not be known for several years. Promoting greater levels of outcrossing may further expedite population improvement.

2.2 Recurrent Selection

Recurrent selection, reviewed by Rutkoski [3], is a population improvement method which aims to enhance the breeding population as a whole through crossing and recombination. Compared with evolutionary breeding, which largely relies on natural outcrossing, recurrent selection more readily facilitates recombination through successive intermating, and incremental genetic improvement occurs in the population over time as the frequency of favorable alleles increases. The recurrent selection process is a cycle that consists of four sequential activities: (1) crossing to recombine breeding materials; (2) generation of new breeding individuals which are non-inbred plants, families, or inbred lines; (3) evaluation of the breeding individuals by phenotyping and/or genotyping; and (4) identification of the best breeding individuals to use for the next round of crossing. Selection may be imposed on a single quantitative trait of interest or on an index of multiple traits (see Chap. 32). Given this process, the breeding population mean is expected to improve by $R = k\, r_{xg}\, \sigma_g / L$ units each year, where $k$ is the selection intensity in standard deviation units, $r_{xg}$ is the accuracy of selection, $\sigma_g$ is the genetic standard deviation, and $L$ is the duration of the breeding cycle, defined as the time which elapses between crossing and evaluation of the derived progeny.
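The expression for the expected rate of improvement can be turned into a small calculation to show how strongly the cycle duration affects gain per year. The sketch below is illustrative only: the function name and all numeric inputs are hypothetical and do not come from any cited study.

```python
def gain_per_year(k, accuracy, genetic_sd, cycle_years):
    """Expected response per year: R = k * r_xg * sigma_g / L."""
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    return k * accuracy * genetic_sd / cycle_years


# Hypothetical scenarios: the same selection intensity and genetic standard
# deviation, but a long, accurate cycle versus a short, less accurate one.
conventional = gain_per_year(k=1.4, accuracy=0.7, genetic_sd=0.3, cycle_years=5)
rapid_cycle = gain_per_year(k=1.4, accuracy=0.4, genetic_sd=0.3, cycle_years=1)
```

With these made-up inputs, the short cycle gives the larger expected gain per year despite its lower selection accuracy, because the cycle duration enters the expression as a divisor.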

In its simplest form, recurrent selection in wheat consists of selecting and intermating (i.e. ‘recycling’) the best new breeding lines developed by the program each year to generate F1s that then enter the line development process. Because the breeding germplasm is improving, a different set of ‘best’ breeding lines will be selected for intermating each year. The rate of improvement each year is heavily affected by the length of the breeding cycle, with shorter breeding cycles leading to faster rates of gain. As a way to shorten the breeding cycle, recurrent selection in wheat can be imposed among S1 families (equivalent to F2 families) based on phenotypic or genomic selection [4, 5]. Each cycle, individual plants can be selected out of rapid recurrent selection populations and put through a line development process like bulk or SSD to generate lines for further testing as potential varieties. For traits like yield, which should be evaluated as a uniform stand to eliminate intra- and inter-genotypic competition effects, genomic selection [6] (see Sect. 6.3.4) is a promising solution. A recent simulation study demonstrated that, assuming a fixed budget, rapid genomic recurrent selection in winter wheat has the potential to increase the rate of genetic gain for yield by nearly 2.5-fold compared to a conventional breeding strategy [7].

Generating crosses in wheat involves a labor-intensive process of removing the anthers within each floret with forceps or scissors prior to anthesis and later introducing pollen from another plant. The reproductive biology of wheat therefore acts as a limiting factor to rapid cycling for recurrent selection. Male sterility greatly facilitates outcrossing between individuals, and both recessive and dominant male-sterility genes are available in wheat. Figure 6.1 shows several recurrent selection schemes using a dominant male-sterile gene. Those schemes include phenotypic recurrent selection (method A), variations of half-sib selection (methods B, C, and D), S1 selection (method E) and combined S1 and half-sib selection (method F). Line development methods can be readily applied to dominant male-sterile populations to obtain pure lines. Selection of male-fertile plants provides F2 progeny that will breed true for fertility and can be used directly in bulk, single seed descent (SSD), doubled haploid (DH), or backcross breeding line development schemes.

Fig. 6.1 Recurrent selection schemes using a dominant male-sterile allele. (Reprinted with permission from Ref. [8])

3 Selection Methods

Selection in wheat breeding programs may be imposed based on rudimentary visual assessments, sophisticated genomic prediction models, or anything in between. Although the appropriate selection method for a given situation depends on the traits of interest, resources available, and the breeding germplasm, results of many years of research in this area have shed light on which are the most promising methods for wheat breeding.

3.1 Mass Selection Systems

Mass selection involves selection among single plants based on their single-plant phenotypes, and it is a common practice during line development via pedigree or bulk methods and in male sterile-facilitated recurrent selection. Although mass selection is a relatively popular selection method because it is simple and inexpensive to implement, the merit of mass selection in wheat breeding has been controversial. An important limitation on the effectiveness of mass selection is the heritability of the trait of interest. The genetic gain expected from a single generation of mass selection is $k h \sigma_g$, where $k$ is the selection intensity, $h$ is the square root of the heritability, and $\sigma_g$ is the square root of the additive genetic variance. For quantitative traits like yield, $h$ on a single-plant basis will be low, although as long as $h$ is not zero, some gain from selection is expected. Another major criticism of mass selection is that most traits of interest cannot be measured on single plants in realistic production conditions. Thus, the trait under selection will likely be a correlated ‘secondary trait’, and selection will be indirect. The relative efficiency of indirect versus direct mass selection is $k_j r_g h_j / (k_i h_i)$, where $k_i$ is the selection intensity of the trait of interest, $h_i$ is the square root of the heritability of the trait of interest, $k_j$ is the selection intensity of the secondary trait under selection, $r_g$ is the genetic correlation between traits $i$ and $j$, and $h_j$ is the square root of the heritability of the trait under selection. Thus, because the genetic correlation cannot exceed 1 in absolute value, if the secondary trait under selection has a heritability level similar to that of the primary trait, indirect selection will be less effective than direct selection at equal selection intensities.
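The two expressions in this paragraph can be evaluated directly. The following sketch uses hypothetical function names and made-up parameter values, and takes heritabilities and variances as inputs so that the square roots in the formulas are explicit.

```python
import math


def mass_selection_gain(k, heritability, additive_variance):
    """Expected gain from one generation of mass selection: k * h * sigma_g."""
    return k * math.sqrt(heritability) * math.sqrt(additive_variance)


def indirect_relative_efficiency(k_j, h2_j, r_g, k_i, h2_i):
    """Ratio of indirect to direct response: (k_j * r_g * h_j) / (k_i * h_i)."""
    return (k_j * r_g * math.sqrt(h2_j)) / (k_i * math.sqrt(h2_i))


# Hypothetical values. Equal intensities and heritabilities: the ratio
# reduces to the genetic correlation, so indirect selection is less efficient.
same_heritability = indirect_relative_efficiency(1.4, 0.2, 0.6, 1.4, 0.2)

# A secondary trait that is much more heritable than the primary trait can
# make indirect selection more efficient than direct selection.
high_heritability_secondary = indirect_relative_efficiency(1.4, 0.8, 0.6, 1.4, 0.2)

single_plant_gain = mass_selection_gain(k=1.4, heritability=0.1, additive_variance=0.25)
```

In the second scenario the ratio exceeds 1 only because the secondary trait is assumed to be four times as heritable as the primary trait; with similar heritabilities the ratio cannot exceed the genetic correlation.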

Empirical studies evaluating the effectiveness of mass selection in wheat tend to confirm what would be expected based on theory. In general, direct mass selection tends to be effective, although the effect is often small. For example, Redden and Jensen [9] found that direct mass selection for early tillering, a low heritability trait, was effective, and more cycles of selection and intermating were associated with greater genetic gain. Mass selection for traits that cannot be measured on single plants must be done through indirect means. For example, grain yield measured on individual space-planted plants is an indirect measure of grain yield measured in uniform plots under realistic planting densities. The success of indirect mass selection will depend on the nature of the primary and secondary traits and the genetic correlation between them; therefore, results are expected to vary widely. An experiment evaluating mass selection for grain size using mechanical sorting [10] found that selection was effective at improving grain yield per spike and kernel weight; however, improvement in grain yield was not evaluated. Thakare and Qualset [11] evaluated indirect selection for yield and found that visual selection of single plants was effective. However, random selection was similarly effective at improving grain yield in this experiment, indicating that inadvertent or natural selection, rather than visual selection, may have been the cause of the observed yield gain. Grid selection, or subdividing the field into a grid and performing selection within the grid units, can improve mass selection by reducing the environmental variation among plants in close proximity [12]. Indirect mass selection for grain yield based on yield measurements taken on single, widely-spaced plants in a honeycomb pattern has been found to produce small but significant genetic gains [13].

Although the effectiveness of mass selection has been clearly demonstrated for some traits and populations, the more important question is how well mass selection performs compared to alternative methods on a gain per unit time and cost basis. The major drawback to mass selection is that when performed in pedigree or bulk breeding schemes, as is often done, it can be at odds with rapid generation advancement, which increases rates of genetic gain by reducing the breeding cycle duration. Thus, the benefit of mass selection may not be large enough to outweigh its opportunity cost if it precludes reducing the breeding cycle duration. Mass selection in rapid cycle recurrent selection is much more promising; however, it is worth remembering that it is often not possible to measure and select for all traits of interest on a single plant.

3.2 Selection Based on Best Linear Unbiased Prediction (BLUP)

While mass selection can be inexpensive and feasible for large numbers of candidates, selection based on data replicated within and/or across environments is more accurate, especially for low heritability traits like grain yield. Typically, wheat breeding programs phenotypically evaluate inbred lines replicated within and across environments, and phenotypic observations on lines are often combined using simple arithmetic means or least-squares. However, the statistical procedure Best Linear Unbiased Prediction (BLUP) is the most effective approach to combine multiple phenotypic observations on a breeding individual into a single value which represents its genetic value. These estimates of individuals’ genetic values based on BLUP are referred to as ‘BLUPs’ or ‘random effects’. The BLUP procedure was developed by Henderson [14] for animal breeding as a way of maximizing selection accuracy, and therefore genetic improvement, given all data available while also accounting for non-genetic effects such as effects of environment. BLUPs are referred to as ‘shrinkage estimators’ because they are compressed towards zero, that is, towards the population mean when genetic values are expressed as deviations from it, depending upon the degree of uncertainty in the estimate. For example, BLUPs for individuals which appear outstanding based on very little data will be shrunk heavily towards the population mean to reflect what is more likely to be their true genetic value. In contrast, BLUPs for individuals which appear outstanding based on many observations within and across environments will be shrunk towards the population mean only very slightly since their performance is known with a high degree of certainty. In this way, it is possible to accurately rank and compare individuals that have been evaluated over different numbers of environments and/or replications.
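The shrinkage behavior described above can be shown with a deliberately simplified case. The sketch assumes a model with only an overall mean, unrelated lines, and known genetic and error variances, in which the predicted genetic value of a line is its deviation from the mean multiplied by the factor σg² / (σg² + σe² / n). Real analyses fit mixed models with environment effects and estimated variance components; the function name and all values here are hypothetical.

```python
def shrunken_line_value(line_mean, overall_mean, n_obs, genetic_var, error_var):
    """Predicted genetic value of a line as a deviation from the overall mean.

    Simplifying assumptions (not from the chapter): unrelated lines, known
    variance components, and no fixed effects other than the overall mean.
    """
    if n_obs < 1:
        raise ValueError("n_obs must be at least 1")
    shrinkage = genetic_var / (genetic_var + error_var / n_obs)
    return shrinkage * (line_mean - overall_mean)


# Two hypothetical lines with the same observed mean yield (6.0 versus an
# overall mean of 5.0) but very different amounts of supporting data.
few_observations = shrunken_line_value(6.0, 5.0, n_obs=1, genetic_var=0.1, error_var=0.4)
many_observations = shrunken_line_value(6.0, 5.0, n_obs=12, genetic_var=0.1, error_var=0.4)
```

Both lines look equally outstanding on their raw means, but the line observed once is predicted at only 0.2 units above the mean, whereas the line observed twelve times retains 0.75 units of its apparent advantage.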

Another major advantage of BLUP is that it supports the utilization of data from multiple traits and/or multiple environments and on related individuals in an optimal way through pedigree BLUP and multi-trait pedigree BLUP [15, 16] or genomic BLUP [17] and multi-trait genomic BLUP [18]. The latter are commonly used for genomic selection models. A useful feature of multi-trait BLUPs is that they can be multiplied by a vector of economic weights to easily estimate an optimal selection index [19]. The use of BLUP for selection is not controversial, and many research groups have empirically demonstrated the effectiveness of BLUP methods for wheat breeding, especially for selection based on yield in multiple environments [20]. With the rise in popularity of genomic selection in recent years, BLUP methods are becoming increasingly important in wheat breeding.
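Multiplying multi-trait BLUPs by a vector of economic weights amounts to a weighted sum per line. The sketch below shows only that final step; the line names, trait BLUPs, and weights are invented for illustration, and the BLUPs are assumed to have been estimated beforehand with a multi-trait model.

```python
def selection_index(blups_by_line, economic_weights):
    """Index value per line: sum over traits of weight * BLUP."""
    index = {}
    for line, blups in blups_by_line.items():
        if len(blups) != len(economic_weights):
            raise ValueError("each line needs one BLUP per weighted trait")
        index[line] = sum(w * b for w, b in zip(economic_weights, blups))
    return index


# Hypothetical BLUPs for (grain yield, grain protein), as deviations from the
# population mean, and hypothetical economic weights for the two traits.
blups = {
    "line_A": (0.30, -0.20),
    "line_B": (0.10, 0.40),
    "line_C": (-0.05, 0.10),
}
weights = (1.0, 0.5)

index = selection_index(blups, weights)
ranking = sorted(index, key=index.get, reverse=True)
```

In this made-up case line_B ranks first on the index even though line_A has the highest yield BLUP, because the index also credits line_B for its protein value.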

3.3 Marker-Assisted Selection

Marker-assisted selection (MAS) is based on the premise that selection based on DNA markers can be more effective or efficient than selection based on phenotypes. Here we introduce conventional MAS, which includes MAS methods other than genomic selection, while more in-depth coverage of MAS can be found in Chap. 28. Conventional MAS in plant breeding, reviewed by Collard and Mackill [21], involves (1) detecting diagnostic markers closely linked to genes affecting the traits of interest; (2) validating those markers in the germplasm where MAS is to be applied; and (3) routine selection based on the validated markers during the breeding process. MAS was a revolutionary idea because it implied that breeders could select on alleles directly without phenotyping [22]. Although MAS never replaced phenotyping and conventional selection methods, it now plays an important role in backcross introgression, gene pyramiding and line development in wheat breeding.

The earliest use of molecular markers in plant breeding was in the backcross method. The ability to identify recombinants close to one or more genes or quantitative trait loci (QTL) from a donor parent and to simultaneously select for the elite background genotype transformed molecular markers into a modern breeding approach for crop improvement. This strategy, referred to as ‘marker-assisted backcrossing’ (MABC), greatly improved the efficiency of backcrossing alleles that are recessive, epistatic, or affecting traits that cannot be easily measured on a single plant basis.

Tanksley et al. [22] first proposed the use of markers in backcrossing to introgress target genes of interest and select for the genome of the recurrent parent. Hospital et al. [23] were among the first to investigate, through simulation, the optimization of molecular markers to simultaneously perform ‘background selection’, where selections are made against donor alleles at non-target loci, and ‘foreground selection’, in which the target loci are selected. The authors evaluated variables such as time and intensity of selection, population size, and number and position of markers and found that MABC led to a gain of about two generations to recover the recurrent parent genome compared to conventional backcrossing without the use of markers. This reduction in the amount of time required for backcrossing is substantial when considering that MABC can be conducted year-round in the greenhouse because phenotyping is not necessary. Their simulation also showed that three markers per non-carrier chromosome (100 cM) were adequate to select for the elite background genotype in early generations because few recombination events have occurred. In later generations, most of the recurrent parent genome has been recovered, so few donor parent segments remain to be eliminated.
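For context on what a gain of about two generations means, the expected recovery of the recurrent parent genome under conventional backcrossing, without any marker-based background selection, follows the standard expectation 1 − (1/2)^(n+1) after n backcrosses. The sketch below computes that baseline only; it is not the simulation of Hospital et al. [23], the function names are hypothetical, and linkage drag around the target gene is ignored.

```python
def expected_recurrent_fraction(n_backcrosses):
    """Expected recurrent parent genome fraction after n backcrosses.

    Baseline without background selection; n = 0 corresponds to the F1.
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


def backcrosses_needed(target_fraction):
    """Smallest number of backcrosses whose expectation reaches the target."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be between 0 and 1")
    n = 0
    while expected_recurrent_fraction(n) < target_fraction:
        n += 1
    return n


baseline = [expected_recurrent_fraction(n) for n in range(1, 5)]
generations_for_95_percent = backcrosses_needed(0.95)
```

On expectation, conventional backcrossing needs four backcrosses to exceed 95% recurrent parent genome, which gives a sense of the time saved when background selection reaches a similar level roughly two generations sooner.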

Empirical studies have demonstrated that MABC is effective for wheat breeding for traits that are conferred by few large-effect loci. For example, Randhawa et al. [24] used MABC with foreground and background selection to introgress a yellow rust (Puccinia striiformis f. sp. tritici) resistance gene into an elite background in only two backcross generations while recovering 97% of the recurrent parent genome. An important consideration for the application of MABC is that it can be less effective than phenotypic selection if the trait of interest provided by the recurrent parent is conferred by multiple QTL that are not tightly linked to the markers used for foreground selection. Furthermore, as discussed in Chap. 5, backcrossing is a conservative breeding strategy because it cannot produce lines that are superior to the best parent for quantitative traits such as grain yield. Thus, the cultivars that result from backcrossing may be difficult to commercialize unless the trait introgressed is of high economic value.

In addition to MABC, gene or QTL pyramiding using markers has been proposed in order to achieve an ideal genotype containing two or more genes or QTL originating from different parents. The simplest pyramiding strategy, demonstrated by Liu et al. [25] with two powdery mildew (Erysiphe graminis f. sp. tritici) resistance genes, relies on crossing two near isogenic lines (NILs), where each NIL contains different alleles in the same genetic background. The resulting genotypes are then self-pollinated, and lines homozygous for both genes are selected. Unfortunately, for most desired gene pyramids, appropriate NILs are not readily available. In this case, a crossing and selection strategy which combines genes or QTL from two different parents is needed. Obtaining the desired gene or QTL combination in early generations requires large population sizes. Selection among inbred lines can reduce the number of lines needing to be screened; however, resources must be spent to generate inbred lines that will ultimately get discarded. Bonnett et al. [26] discuss strategies to efficiently pyramid multiple genes in wheat using molecular markers.
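The effect of generation on the population size needed for pyramiding can be approximated with basic Mendelian expectations. The sketch assumes unlinked target loci, a single biparental cross with each parent contributing different target alleles, and no selection before screening; under these assumptions the desired homozygote occurs at a frequency of (1/4)^g in the F2 and (1/2)^g among fully inbred lines for g genes. The function names and the 95% probability threshold are illustrative choices.

```python
import math


def target_frequency(n_genes, inbred_lines=False):
    """Expected frequency of plants or lines homozygous at all target loci.

    Assumes unlinked loci: 1/4 per locus in the F2, 1/2 per locus among
    fully inbred lines (for example doubled haploids) derived from the F1.
    """
    per_locus = 0.5 if inbred_lines else 0.25
    return per_locus ** n_genes


def minimum_population_size(target_freq, probability=0.95):
    """Smallest n with at least the given chance of one target genotype."""
    if not 0.0 < target_freq < 1.0 or not 0.0 < probability < 1.0:
        raise ValueError("frequencies and probabilities must be in (0, 1)")
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - target_freq))


f2_plants_two_genes = minimum_population_size(target_frequency(2))
inbred_lines_two_genes = minimum_population_size(target_frequency(2, inbred_lines=True))
f2_plants_four_genes = minimum_population_size(target_frequency(4))
```

For two genes, roughly four times as many F2 plants as inbred lines must be screened to have a 95% chance of recovering at least one double homozygote, and the F2 requirement grows quickly as more genes are added.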

Although pyramiding is useful in some cases, most traits of interest in wheat are complex and cannot be improved sufficiently through pyramiding. To breed for quantitative traits as well as traits conferred by major-effect loci, an approach referred to as ‘forward breeding’ is preferred. With forward breeding, major-effect alleles are first introgressed into elite breeding lines which are then used in crosses. Populations which segregate for the major-effect alleles are subject to MAS in early or late generations, and then lines are derived and evaluated for all traits of interest. Anderson et al. [27] described the application of a MAS forward breeding strategy for improving Fusarium Head Blight resistance where MAS is applied for multiple loci in the F2 and F3 generations on a few breeding populations and then lines derived from these populations are phenotypically evaluated.

3.4 Genomic Selection

Genomic selection (GS) [6] is a form of MAS which is vastly different from conventional MAS in its approach. Unlike many other MAS strategies, the goal of GS is to improve the breeding germplasm as a whole for all traits of interest over multiple cycles of population improvement. GS is based on genomic estimated breeding values (GEBVs), which are estimates of individuals’ values as parents based on genomic markers. Accurate estimation of GEBVs requires phenotypic and genotypic data on a ‘training population’. This information is fed into a genomic prediction model, and GEBVs are predicted for selection candidates which have been genotyped but not necessarily phenotyped. For an ongoing wheat breeding program, breeding lines developed over the past few years can serve as the training population, as long as phenotypic and genome-wide marker data are available for these lines.
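The flow from training population to GEBVs can be sketched with ridge regression on marker genotypes, one of several possible genomic prediction models. Everything in the example is an assumption made for illustration: the marker coding (-1, 0, 1), the tiny noise-free data set, the fixed ridge penalty, and the function names. In practice the penalty is derived from estimated variance components and thousands of markers are used.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def ridge_marker_effects(markers, phenotypes, ridge_penalty):
    """Solve (Z'Z + lambda * I) u = Z'(y - mean) for marker effects u."""
    n_markers = len(markers[0])
    mean = sum(phenotypes) / len(phenotypes)
    centered = [y - mean for y in phenotypes]
    ztz = [[sum(row[i] * row[j] for row in markers)
            + (ridge_penalty if i == j else 0.0)
            for j in range(n_markers)] for i in range(n_markers)]
    zty = [sum(row[i] * y for row, y in zip(markers, centered))
           for i in range(n_markers)]
    return mean, solve_linear_system(ztz, zty)


def predict_gebv(markers, effects):
    """GEBV (deviation from the mean) = sum of genotype code * marker effect."""
    return [sum(z * u for z, u in zip(row, effects)) for row in markers]


# Hypothetical training population: 4 lines, 3 markers, noise-free phenotypes.
training_markers = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
training_phenotypes = [5.3, 5.7, 4.1, 4.9]
mean, effects = ridge_marker_effects(training_markers, training_phenotypes, 1.0)

# Selection candidates that were genotyped but never phenotyped.
candidate_markers = [[1, -1, 1], [0, 1, -1]]
gebvs = predict_gebv(candidate_markers, effects)
```

The first candidate receives the higher GEBV and would be preferred as a parent, even though neither candidate has any phenotypic record of its own.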

Selection based on a GEBV can be more effective than selection based on a phenotype or BLUP estimated without genomic relationships. However, the main advantage of selection based on GEBVs is that they can be estimated for individuals that have not yet been phenotyped. This allows breeders to identify parents to be used in crossing much earlier in the breeding process. For example, in a conventional wheat breeding program, selection is typically imposed among breeding lines that have undergone 2–3 years of line development and 2–3 years of testing. In a typical wheat breeding program implementing a conservative GS strategy, selection is imposed among breeding lines that have undergone 2–3 years of line development and 0–1 years of testing. This reduction in the breeding cycle duration is what leads to faster rates of genetic gain [28]. Counterintuitively, the time and effort devoted to phenotyping may remain unchanged because breeding lines will continue to undergo testing even after they are selected as parents in order to gather phenotypic data that will be needed to make variety release decisions and to update the training population.

Prior to implementing GS in a wheat breeding program, several conditions should be met. First, the breeding program should be able to routinely obtain inexpensive genome-wide marker data within 1–6 months of tissue sampling and DNA extraction. Increasing the speed of genome-wide marker data acquisition and reducing the cost of genotyping relative to phenotyping can improve the potential for using GS to shorten the breeding cycle duration and accelerate rates of genetic gain [29]. Second, a breeding program should be selecting parents largely from within the breeding program itself. This recycling of elite lines within the program is needed so that genetic gain can be achieved over cycles and so that phenotypic and marker data generated on the breeding materials can contribute to training an accurate prediction model for GS in future generations. Finally, the breeding program should be collecting and carefully managing high-quality phenotypic data on all traits of interest and on traits highly correlated with the traits of interest, which are referred to as ‘secondary traits’. Breeders may cull plants or lines based on traits that are visually observed in the field but not systematically phenotyped. In order to continue selecting for these traits using GS, it will be necessary to record and manage phenotypic data for all target traits. Data on secondary traits, while not essential, can help increase GS accuracies when used together with data on the traits of interest in multi-trait GS models [30]. For example, multi-trait GS models for predicting grain yield in wheat were shown to be more accurate than single-trait GS models when using secondary trait data in the form of aerial imagery [31].

To begin using GS, it is recommended that breeding programs start by genotyping all lines that are being phenotyped for yield and other important traits and use all available phenotypic and marker data for selection. Endelman et al. [29] showed that this strategy is ideal if the cost of genotyping is similar to or higher than the cost of phenotyping. Another advantage of genotyping all lines being phenotyped is that it also allows genotypic and phenotypic data to be accumulated. This then becomes the GS model training population for future generations. After multiple years of phenotyping and genotyping, it may be possible to accurately estimate GEBVs on lines that have not yet been phenotyped to enable selection in early stages of line development in order to reduce the breeding cycle duration and increase the rate of genetic gain.

Ideally, GS will be implemented in a two-part strategy in which selection will be performed among breeding lines as well as among individual plants in a rapid-cycle recurrent selection program. The effectiveness of this strategy can be further enhanced by integrating optimum contribution selection (OCS) [32]. OCS, reviewed by Woolliams et al. [33], is a method which optimizes how much selected parents participate in crosses in order to control how fast the population loses genetic variability. GS can lead to more rapid loss of genetic variability compared with conventional selection methods, largely because it enables many cycles of breeding to be performed in a short time period. It has also been observed that selected individuals are more likely to be close relatives of one another as GS accuracy decreases, leading to faster rates of inbreeding and faster losses in genetic variance per cycle [34]. Because the strategy of rapid cycle GS is based on achieving many cycles of low accuracy GS in a short period of time, genetic variance loss is expected to be especially severe in rapid cycle GS programs. Empirically, faster losses in genetic variance under GS compared with phenotypic selection (PS) have been observed in a short-term recurrent selection experiment for quantitative stem rust resistance in wheat [4]. Veenstra et al. [5] found that GS for fructan content in wheat seeds was effective using both truncation selection and OCS, but inbreeding was significantly reduced using OCS. Because populations with more genetic variability can achieve higher rates of genetic gain compared to those with less, it is important that the loss of genetic variability in GS-based breeding programs be managed, especially if a rapid cycle strategy is adopted. Given that marker genotypes and estimates of genomic relationships are available to breeders implementing GS, managing loss of genetic variability over time is feasible using estimates of genomic relationship in OCS and/or by placing a higher weight on the effects of favorable, low-frequency marker alleles [34] in the GEBV estimation procedure.
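The trade-off that OCS manages can be made concrete with a toy example. For a vector of parental contributions c that sums to 1, the expected merit of the progeny is c'g, where g holds the GEBVs, and the group coancestry is c'Ac/2, where A is a relationship matrix. The sketch below searches a coarse grid of contributions for the plan with the highest merit under a coancestry cap. The grid search is a stand-in chosen for readability, not the optimization procedure used in the OCS literature, and the GEBVs, relationship matrix, and cap are hypothetical.

```python
from itertools import product


def group_coancestry(contributions, relationship):
    """Group coancestry of a mating plan: 0.5 * c'Ac."""
    n = len(contributions)
    return 0.5 * sum(contributions[i] * relationship[i][j] * contributions[j]
                     for i in range(n) for j in range(n))


def expected_merit(contributions, gebvs):
    """Expected progeny merit: c'g."""
    return sum(c * g for c, g in zip(contributions, gebvs))


def grid_search_contributions(gebvs, relationship, max_coancestry, steps=10):
    """Best plan on a coarse grid subject to a coancestry cap (toy solver)."""
    best = None
    n = len(gebvs)
    for counts in product(range(steps + 1), repeat=n - 1):
        last = steps - sum(counts)
        if last < 0:
            continue  # contributions must sum to 1
        contributions = [c / steps for c in counts + (last,)]
        if group_coancestry(contributions, relationship) > max_coancestry:
            continue
        merit = expected_merit(contributions, gebvs)
        if best is None or merit > best[0]:
            best = (merit, contributions)
    return best


# Hypothetical candidates: the two best are close relatives of each other.
gebvs = [1.0, 0.9, 0.4]
relationship = [[1.0, 0.5, 0.0],
                [0.5, 1.0, 0.0],
                [0.0, 0.0, 1.0]]

truncation_plan = [0.5, 0.5, 0.0]  # use only the top two candidates
truncation_coancestry = group_coancestry(truncation_plan, relationship)
best_merit, best_plan = grid_search_contributions(gebvs, relationship, 0.30)
```

Here, using only the two top-ranked relatives exceeds the coancestry cap, so the constrained plan keeps most of the merit by favoring the best candidate and giving the unrelated third candidate a modest contribution.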

Although the complexities of implementing GS in a wheat breeding program may seem daunting, even a simple GS strategy can be useful. Most wheat breeding programs in the United States and at CIMMYT are implementing some form of GS to help improve line advancement decisions during testing and to improve parent selection. Most of these programs are not yet routinely using rapid cycle recurrent GS, multi-trait GS models, or OCS. Over time, as more analytical tools are developed and as breeders become more skilled in GS methods and analytical techniques, breeding programs will be able to evolve further to take advantage of the potential of GS to maximize rates of genetic gain in wheat using selection procedures that are increasingly complex and data-driven.

4 Key Concepts

The goal of population improvement is to enhance the genetic base of the breeding program, while selection methods aim to identify breeding lines with superior potential or performance. Recurrent selection is the predominant approach to population improvement in wheat breeding and aims to enhance the breeding population as a whole through crossing and recombination. Generating crosses in wheat is a labor-intensive process; however, both recessive and dominant male-sterility genes are available in wheat that can greatly facilitate intermating. Mass selection, selection based on best linear unbiased prediction, marker-assisted selection and genomic selection are commonly used selection methods in wheat breeding.

5 Conclusions

Considering the multitude of approaches to population improvement and selection methods as well as the many line development methods described in Chap. 5, breeders are tasked with determining how to best combine these into an effective breeding strategy given fixed financial resources and infrastructure. Rather than take this challenge head-on, most successful breeding programs repeat what has traditionally been done with minor modifications. Complete redesign of a breeding program is rarely undertaken, as it can be disruptive to the ongoing variety development process. However, it should be recognized that methods like GS are inherently disruptive when used to their full potential. Thus, flexibility and an open mind will be needed in order for a breeding program to develop and deploy an optimal strategy using the latest methodological advancements.