Introduction

Genetic gains in spring wheat yield are currently in the region of 0.5–1% per year (Sharma et al. 2012; Fischer et al. 2014; Crespo-Herrera et al. 2017) and are due almost entirely to conventional approaches. Yield gains are achieved through unspecified recombination of genes of minor effect when elite lines are intercrossed, as well as the introgression of new genetic diversity from landraces or wild relatives (Braun et al. 2010), typically as sources of disease resistance and grain quality traits (Ortiz et al. 2008; Sehgal et al. 2015). Additional information about candidate parents can help design more complementary crosses. For example, high density markers, as well as pedigree information/coefficient of parentage (COP), can be used to maximize genetic diversity between elite parents (Sukumaran et al. 2017). However, two elite lines, even with diverse marker profiles or low COP, may still rely on the same traits and alleles to achieve good expression of yield. Genetic and physiological characterization of lines can help make more strategic crosses in terms of complementary traits and alleles. By crossing lines with good yield and complementary traits, the probability of achieving cumulative gene action for yield increases, compared to crossing uncharacterized lines or ones expressing the same traits.

Since understanding of the genetic basis of cultivar level differences in yield potential is very limited, trait based crossing currently relies heavily on phenotypic information (Reynolds and Langridge 2016), although in the few backgrounds and environments in which genetic dissection has been performed, markers that pertain to yield or its components can also be used to guide hybridization strategies (e.g. Griffiths et al. 2015). If the objective is to increase tolerance to abiotic stress, empirical data and models (Richards 2006; Ghanem et al. 2014; Cossani and Reynolds 2015) can help suggest parents with potentially complementary trait/allele profiles. Similarly, conceptual models of complementary traits can be used to design strategic crosses for increasing yield potential (Reynolds et al. 2012). In this context, it can be helpful when characterizing and selecting complementary parents to consider a ‘pyramid’ of traits going from highly integrative to less integrative (Fig. 1). For example, since yield is a function of total biomass (BM) and of partitioning biomass to grains, i.e. harvest index (HI), one can hypothesize that crosses between lines with a large BM (and related traits) on the one hand, and high expression of HI (and related traits) on the other, are more likely to be complementary than crosses between high yield × high yield lines, assuming both parents have adequate expression of yield.
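The source/sink rationale can be illustrated with a minimal numerical sketch, assuming the simple identity yield = BM × HI. The parent values below are hypothetical and are not taken from this study; the "best case" is only an upper bound, since biomass and harvest index are often negatively associated in practice.

```python
# Illustrative sketch only: all trait values are hypothetical.

def grain_yield(biomass_t_ha: float, harvest_index: float) -> float:
    """Grain yield (t/ha) as the product of above-ground biomass and harvest index."""
    return biomass_t_ha * harvest_index

# Two hypothetical parents with similar yield but contrasting trait profiles.
source_parent = {"biomass": 18.0, "hi": 0.40}  # high biomass ("source")
sink_parent = {"biomass": 15.0, "hi": 0.48}    # high harvest index ("sink")

yield_source = grain_yield(source_parent["biomass"], source_parent["hi"])
yield_sink = grain_yield(sink_parent["biomass"], sink_parent["hi"])

# Upper bound for a recombinant expressing the better value of each trait.
best_case = grain_yield(
    max(source_parent["biomass"], sink_parent["biomass"]),
    max(source_parent["hi"], sink_parent["hi"]),
)

print(f"source parent: {yield_source:.2f} t/ha")
print(f"sink parent:   {yield_sink:.2f} t/ha")
print(f"best-case recombinant: {best_case:.2f} t/ha")
```

Two parents with the same yield can therefore still differ in how that yield is achieved, which is the situation a high yield × high yield cross cannot distinguish.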

Fig. 1

Trait hierarchy in relation to their degree of integration depicting the main drivers of yield (biomass and harvest index), and some of their components/subcomponents (not exhaustive). Int interception, DW dry weight, WSC water soluble carbohydrate, PS photosynthesis

This paper presents the outcome of a considerable body of work in which the value of source/sink crosses was tested in a realistic breeding context. The idea that biomass contributes to yield is borne out by theory (a large biomass pushes the theoretical yield margin) and in practice, since biomass is commonly associated with genetic gains in yield (Aisawi et al. 2015). The importance of improving biomass is also borne out by the apparent stagnation of harvest index in recent decades (Foulkes et al. 2011). However, crossing and selecting for biomass alone is not sufficient since biomass and harvest index often show negative association (Aisawi et al. 2015). Furthermore, putting positive selection pressure on HI increases the chances that among the many competing sinks expressed during stem elongation (including for height), grain sinks will be favored (Foulkes et al. 2011). The disadvantage of selecting for high biomass is that it can bring in lateness and lodging tendency associated with increased height. While this tendency was observed in the current study, results also showed that these linkages can be broken even in the best yielding lines.

The breeding results described in the current work were founded on strategic crossing using, as trait sources, a large body of genetic resources that had been selected either at high yield potential, or under heat stress (Reynolds et al. 2015). Selected genetic resources constituted the female parents while high-yielding, broadly adapted lines with appropriate height, phenology and disease resistance were used as males, and as recurrent male parents if crossing with a female landrace or synthetic hexaploid. Crosses were designed to complement “source” with “sink” traits, in the sense that at least one parent was selected for favorable expression of biomass (source) and the other for favorable expression of sink related traits, including harvest index (HI), kernel number per m2 (KNO), thousand kernel weight (TKW), and grains per spike (GSP).

The overall objective of the study was to demonstrate in a breeding context that complementing source with sink traits through strategic crossing can achieve significant yield gains. The specific objectives of the study were: (i) to derive suitable progeny from strategic crosses, mainly in terms of yield, biomass, phenology, height and rust resistance, targeted to favorable and heat stressed environments, respectively; (ii) to evaluate the performance of the selected progeny across a range of target environments; and (iii) to compare the performance of the best lines with that of local checks and elite CIMMYT material.

Materials and methods

Germplasm: parents, international nurseries, and checks

The lines used as parents for trait-based crossing in this study came from a number of different sources—described below—having been evaluated for yield and other agronomic traits using methodologies developed previously (Reynolds et al. 2007).

IWIN nurseries

The International Wheat Improvement Network (IWIN) routinely develops and tests new wheat lines at more than 700 field sites in over 90 countries. Breeding is conducted at a few strategic research hubs to annually develop up to 1000 new high yielding, disease-resistant lines targeted to major agro-ecologies. Germplasm delivered is used as sources of outstanding traits for breeding; as candidates for variety release; and for research into local adaptation (Braun et al. 2010; Gourdji et al. 2012; Reynolds et al. 2017).

Landraces

Approximately 15,000 spring wheat landraces from the World Wheat Collection have recently been pre-screened under high temperature stress. Several hundred landraces were selected for superior yield and biomass. Some of these were identified by the Focused Identification of Germplasm Strategy (FIGS) (Sehgal et al. 2015).

Primary synthetics

CIMMYT has generated 2000 so-called ‘synthetic wheat’ genotypes using novel genetic variation in diploid and tetraploid wheat (Ortiz et al. 2008). These primary synthetics can be crossed easily with elite bread wheat lines, and backcross derivatives are already well represented in new high yielding lines distributed by IWIN. Field studies have also shown that synthetic-derived material can confer significant advantages in terms of yield and biomass under heat stress (Cossani and Reynolds 2015). All 2000 lines were recently screened at CIMMYT under yield potential conditions as well as under heat and drought stress, and lines were identified that expressed superior performance in terms of biomass and/or yield under each of these conditions as well as some in combination (Reynolds et al. 2015).

WYCYT and SATYN international nurseries

Two types of nurseries were developed in this study, using almost the same breeding methodology: one targeted at high yield potential irrigated environments (Wheat Yield Collaboration Yield Trial, WYCYT) and the other targeted at heat stressed, irrigated environments (Stress Adaptive Trait Yield Nursery, SATYN). Field temperature profiles used to select the parents and progeny for these nurseries are shown below (Fig. 2). In both cases, crosses were designed where at least one parent was selected for favorable expression of biomass (source) and the other for favorable expression of sink related traits, including harvest index (HI), kernel number per m2 (KNO), thousand kernel weight (TKW), spike density (SPM2), and grains per spike (GSP), under their respective yield potential and irrigated heat stress environments. Incidentally, results of the 1st WYCYT are not presented in this study as lines were generally not higher yielding than elite CIMMYT checks, though many showed superior biomass. Data of the 1st and 3rd SATYN nurseries are not presented as they were targeted to rainfed environments and will be reported separately.

Fig. 2

Long term average day, night, and daily mean temperatures during the winter/spring wheat cycle at the CENEB experiment station, Obregon, NW Mexico (5 year mean 2010/11–2014/15). Temperate trials (WYCYT and early generations of SATYN) emerge in mid-December and mature approximately 120 days later during April (indicated by thin arrows). Heat trials—SATYN evaluations from F4 generation onwards—emerge about 10 weeks later (in early March) and have a cycle duration of about 80 days (indicated by thick arrows)

Checks

Two classes of check were used in this study: (1) local checks (LCH) selected by national program collaborators that represent their best available advanced line adapted to local conditions; and (2) the best available elite CIMMYT advanced lines, selected based either on their superior performance in experiments at the breeding site (e.g. Sokoll), or where available, the return of recent IWIN data (e.g. Quaiu). Their superior yield potential was verified in multi-year yield trials at the breeding/selection site (Obregon, Campo Experimental Norman E. Borlaug (CENEB CIMMYT)). The best yielding elite (CIMMYT) check from each international nursery, considering average yield across all international sites (namely, Vorobey in 2nd SATYN; Roelfs in 2nd WYCYT; Sokoll in 4th SATYN; and Borlaug-100 in 3rd WYCYT), yielded 7.4 t ha−1, 7.5 t ha−1, 7.6 t ha−1, and 7.7 t ha−1, respectively (not statistically different), where Borlaug-100 is the current gold standard in terms of yield potential.

Breeding methodology

The breeding approach was designed to test new strategic crosses, i.e. physiological trait (PT) crosses, in a relatively short space of time; in this case strategic crosses designed to combine source and sink traits in a 5–6 year, rapid cycle breeding time frame. The scheme is illustrated in Fig. 3. The growing conditions at the CIMMYT Obregon breeding station (27° 20′ N; 109° 54′ W; 38 m above sea level) are those of an irrigated, spring wheat growing environment (Reynolds et al. 2007). Most of the breeding occurred at this site, namely: crossing; selections for plant type, phenology and leaf rust resistance, which were made in F2 and F4 bulks of 3–4 thousand plants/cross (spaced at approximately 2–3 cm on rows 12 cm apart); yield measured in small (2 m2) F5 yield plots derived from the seed of selected individual F4 plants (augmented designs); and yield and yield components at F7 in replicated trials. The environment at the CIMMYT El Batán experimental station (19°31′ N; 98°50′ W; 2249 m above sea level) is that of a largely rain-fed, temperate spring wheat growing environment (the 10 year mean temperature, 2007–2016, for the June–September wheat cycle was 17.1 °C). This site was used for growing F1 plants, backcrossing in the case of exotic parents, selection of F3 plants from bulks (mainly for rust resistance as well as height and phenology), and for seed multiplication in the F6 generation (sown using 12–20 relatively uniform spikes bulked from the F5 yield plot) to generate enough seed for F7 yield trials in Obregon and for further multiplication in a quarantine zone (in Mexicali, Baja California) for international yield trials in F8. The international nurseries comprised the best new PT F8 progeny, namely lines with superior or equal yield and/or final biomass to the elite checks, as well as elite CIMMYT checks.
All crossing and selection were carried out in the field without fungicide in order to incorporate an acceptable level of disease resistance, the lack of which would otherwise confound yield data in international trials. In the case of the SATYN, the segregating generations until F3 were sown in the same environment as the WYCYT, but selections in F4 and beyond were made under heat stress by delayed sowing (Fig. 2).

Fig. 3

Rapid cycle pre-breeding scheme to test new strategic crosses in an international context, within a 5-year time frame taking advantage of 2 cycles per year at El Batán (BAT) in the state of Mexico, and Obregon (OBG) in the state of Sonora

Performance of WYCYT and SATYN nurseries at the breeding site

The selected progeny included in the WYCYT and SATYN nurseries were evaluated in replicated trials for yield and yield components using standard procedures as described previously for yield potential and heat stress environments (Reynolds et al. 2007). In summary, both environments are non-limiting in terms of irrigation, fertilizer, pest and disease control. However, the heat trials emerge about 10 weeks later (in early March) than the temperate sown trials (emerging mid-December) and as a result of warmer day and night temperatures (Fig. 2) experience a truncated life cycle of about 80 days from emergence to physiological maturity compared to temperate sown trials where the life cycle lasts approximately 120 days.

Evaluation of performance at international sites

The trials were grown at 50 sites located across some of the major spring wheat growing regions worldwide, providing contrasting spring wheat production environments (Table 1). All sites were high radiation environments for which climatic data are summarized in Fig. 4 and Table 1; see also climatic data for CENEB and South Asian sites in Pask et al. (2014). Trials were conducted in the spring wheat season to correspond with recommended sowing and harvest times, depending on individual environments and contrasting between northern and southern hemispheres. At the majority of sites SATYN trials were planted later than WYCYT trials to increase exposure to heat stress. The two trials were treated as independent experiments. Trials were grown in well managed conditions; at all sites appropriate fertilization was implemented to avoid yield limitations, whilst weed and pest control were applied according to local best practice. Irrigation was applied by gravity-fed flood irrigation, and both trials were grown under fully irrigated conditions.

Table 1 Description of experimental sites
Fig. 4

Map of selected phenotyping sites showing long term average maximum and minimum temperatures (5 year mean 2010/11–2014/15) during the growing season (5 months from date of sowing). Maximum temperatures shown in red, and minimum temperatures shown in blue. (Color figure online)

The 2nd and 3rd WYCYT were grown in the 2013/14 and 2015/16 wheat cycles respectively, both with 42 lines; and the 2nd and 4th SATYN were grown in the 2012/13 and 2014/15 wheat cycles respectively, consisting of 50 and 28 lines respectively. The experimental design at each location was a randomized alpha-lattice with two replications per entry (Barreto et al. 1997) giving a total of 84 plots for both WYCYTs, and 100 and 56 plots for 2nd and 4th SATYN respectively. At most locations, plants were sown on flat beds generally of 2.5–3.0 m length with 4–6 rows (20–25 cm between rows). All plots provided a total harvestable area of on average >3 m2. Seed rate varied according to plot size, but was approximately 120 kg ha−1 to give a plant stand of approximately 300 plants m−2 assuming around an 80% survival rate.

The number of days to heading (DH) or anthesis (ANT) was recorded when the spikes of 50% of plants in a plot had emerged (Zadoks stage 59) or extruded 50% of anthers (Zadoks stage 65) respectively, and the number of days to physiological maturity (MAT) was recorded when 50% of the peduncles in a plot were ripe (Zadoks stage 87) (Zadoks et al. 1974). When all plots reached physiological maturity, plant height (HGT) was determined by measuring the distance between the base of the stem and the top of the spike excluding awns. From total plot harvest, grain yield and total above-ground biomass (at ‘field dry’ moisture content of approximately 12%; i.e. 88% dry weight) and thousand kernel weight (TKW) were determined using standard protocols (see Pask et al. 2012), and harvest index (HI), kernels per square metre (KNO) and grains per spike (GSP) were calculated.
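The calculated components can be sketched as follows, assuming the commonly used definitions (HI as grain yield over above-ground biomass on the same moisture basis, KNO from grain weight per m2 and TKW, GSP from KNO and spike density). The exact protocol is the one in Pask et al. (2012); the function names, the spike density input and the example values here are illustrative assumptions.

```python
# Sketch under stated assumptions; example values are hypothetical.

def harvest_index(grain_yield_t_ha: float, biomass_t_ha: float) -> float:
    """HI: fraction of above-ground biomass partitioned to grain."""
    return grain_yield_t_ha / biomass_t_ha

def kernels_per_m2(grain_yield_t_ha: float, tkw_g: float) -> float:
    """KNO: kernels per m2 from grain yield and thousand kernel weight."""
    grain_g_m2 = grain_yield_t_ha * 100.0  # 1 t/ha equals 100 g/m2
    return grain_g_m2 / tkw_g * 1000.0     # TKW is the weight of 1000 kernels

def grains_per_spike(kno: float, spikes_per_m2: float) -> float:
    """GSP: kernels per m2 divided by spike density (spikes per m2)."""
    return kno / spikes_per_m2

# Hypothetical plot: 6.0 t/ha grain, 15.0 t/ha biomass, TKW 40 g, 400 spikes/m2.
hi = harvest_index(6.0, 15.0)
kno = kernels_per_m2(6.0, 40.0)
gsp = grains_per_spike(kno, 400.0)
print(f"HI={hi:.2f}  KNO={kno:.0f} per m2  GSP={gsp:.1f}")
```

Because grain yield and biomass are both reported at the same field-dry moisture content, HI is unaffected by the moisture basis in this sketch.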

Statistical techniques

Linear mixed models (LMMs) were used to analyse differences between genotypes in yield and biomass using the lme4 package (Bates et al. 2015) in the R statistical environment (v. 3.2.1) (R Core Team 2015). Random effects were included to account for nesting of replications within sites, and of sub-blocks within replications. For each trial LMMs were used to generate adjusted estimates of yield for each genotype at each site. Clusters of sites that had similar patterns of yield across genotypes (i.e. a similar GxE interaction; as per clusters of sites presented in Tables) were identified using the cluster detection feature of the corrplot function (Wei and Simko 2016), with (1-correlation) as the distance measure and the default clustering algorithm (complete linkage method). The number of clusters was chosen by visually inspecting a plot of the correlation matrix for the sites, and choosing the smallest number of clusters (3–5) that resulted in only positive correlations within each cluster. Subsequent analysis was run both across all sites in each trial and within each GxE cluster of sufficient size. In each case we ran an LMM to generate adjusted estimates of average yield and biomass across sites. Analyses tested whether each PT selected genotype had a significantly higher yield and biomass than (i) the local check (LCH) and (ii) the best non-PT selected genotype (elite CIMMYT lines; ECH). These tests used the general linear hypothesis test implemented in the glht function in the multcomp package (Hothorn et al. 2008), with Dunnett’s method for comparison with a control (the local check) to correct for multiple comparisons in each case.
To assess the overall success of PT selected lines in each trial we also compared the mean yield and biomass of (a) top three PT lines versus top three non-PT lines and (b) all PT lines versus all non-PT lines, using t-tests with the adjusted site means for each genotype as data points to ensure within site correlations were accounted for.
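The site clustering step can be sketched in a few lines. The analysis itself was run in R with corrplot; the Python below is only an illustration of complete-linkage clustering on (1-correlation) distances, and it automates the choice of the number of clusters that the study made by visual inspection. The function names and the four-site correlation matrix are hypothetical.

```python
from itertools import combinations

def complete_linkage(dist, k):
    """Agglomerative clustering; merges until k clusters of site indices remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Complete linkage: cluster distance is the largest pairwise distance.
            d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # a < b, so index a is unaffected by the deletion
    return clusters

def all_positive_within(corr, clusters):
    """True if every pair of sites inside each cluster is positively correlated."""
    return all(corr[i][j] > 0 for c in clusters for i, j in combinations(c, 2))

def smallest_valid_k(corr):
    """Smallest number of clusters with only positive within-cluster correlations."""
    n = len(corr)
    dist = [[1.0 - corr[i][j] for j in range(n)] for i in range(n)]
    for k in range(1, n + 1):
        clusters = complete_linkage(dist, k)
        if all_positive_within(corr, clusters):
            return k, clusters

# Hypothetical correlations in genotype yield patterns among four sites.
corr = [
    [1.0, 0.8, -0.2, -0.1],
    [0.8, 1.0, -0.3, 0.1],
    [-0.2, -0.3, 1.0, 0.7],
    [-0.1, 0.1, 0.7, 1.0],
]
k, clusters = smallest_valid_k(corr)
print(k, clusters)
```

With every site in its own cluster the condition is trivially met, so the search always terminates; in the study the accepted range was restricted to 3–5 clusters.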

Results

Performance of progeny from strategic PT crosses in the CIMMYT breeding environment

2nd and 3rd WYCYT

In the CIMMYT breeding environment in Obregon, NW Mexico, the 2nd and 3rd WYCYT (irrigated, yield potential) nurseries showed average yields of 6.6 and 6.8 t ha−1 respectively (Tables 2, 3), with the yield of the best genotype (PT) registering 8.3 t ha−1 (see Supplemental Tables 1 and 2 for individual genotype data). Trials were managed to approach rather than maximize yield potential since the latter invariably results in lodging of some plots. Average biomass for both trials was around 14.5 t ha−1 with the best genotype registering 18.7 t ha−1 in 3rd WYCYT. The cycle lengths of genotypes ranged from 114 to 128 days across both trials, and were on average a week longer for 3rd WYCYT. The best 3 PT lines expressed significantly more yield and biomass than the best 3 checks (all of which were elite lines) in both trials. In 3rd WYCYT, the average of all PT lines showed 5% more yield (P = 0.23) and 11% more biomass (P < 0.01) than the average of the checks. In the 2nd WYCYT the average yield and biomass of all PT lines were not statistically different from the averages for checks. Considering both nurseries, while plant height was on average 7.5 cm taller for the PT lines as a group than that of the checks, the best yielding PT lines were not necessarily taller than checks. The PT lines were on average 2 days later maturing than the checks, but again the best performing PT lines were within the same range as the checks.

Table 2 Agronomic traits of 2nd WYCYT at the selection environment Obregon (combined 2015 and 2016 cycles) and across all target environments and within clusters determined by GxE modelling (2014)
Table 3 Agronomic traits of 3rd WYCYT at the selection environment Obregon and across all target environments and within clusters determined by GxE modelling (2016)

2nd and 4th SATYN

In the CIMMYT breeding environment in Obregon, NW Mexico, heat stress truncated cycle length by about 35% compared to the WYCYT nurseries, while yields were reduced by around 60%, and plant height by about 40% compared to the temperate environment (Tables 4, 5). The 2nd and 4th SATYN nurseries showed average yields of 2.5 and 3.3 t ha−1, respectively, with the best yielding line (PT) registered at 4.6 t ha−1 for 4th SATYN (see Supplemental Tables 3 and 4 for individual genotype data). In 4th SATYN, the best 3 PT lines showed on average 37% more yield than the 3 checks (P = 0.13), and 28% more biomass (P = 0.18). However, when comparing the mean of all PT lines with the mean of all checks there was no appreciable difference. In the 2nd SATYN, differences in performance between PT material and the checks were not apparent in the selection environment.

Table 4 Agronomic traits of 2nd SATYN at the selection environment Obregon and across all target environments and within clusters determined by GxE modelling (2016)
Table 5 Agronomic traits of 4th SATYN at the selection environment Obregon and across all target environments and within clusters determined by GxE modelling (2016)

For both 2nd and 4th SATYN nurseries, the truncated cycle length and plant heights tended to minimize differences among lines in these traits, and as with the WYCYT nurseries, the best performing PT lines showed similar cycle length and height as the checks.

Ranking of PT progeny across international target environments

When considering the absolute ranking of lines averaged across all international sites as well as within the low GxE clusters, the majority of new PT lines out-yielded local checks for all 4 nurseries (Table 6). The same was the case when considering individual sites where the majority of PT lines also out-yielded the local check (data not shown). A number of PT lines outperformed the best elite (CIMMYT) checks, with some variation between nurseries and clusters (Tables 6, 7). In general PT lines from the WYCYT performed better than those from the SATYN relative to the respective CIMMYT checks. Absolute ranking of yield, among other criteria such as maturity class, height, disease score, etc., typically determines selections of lines as potential candidates for release. Relatively small increases in yield represented by genetic gains in the annual to decadal long timescale, while important for farmers, are not easily ratified by stringent statistical tests due to the relatively large errors (e.g. plot to plot variation) compared to the absolute magnitude of genetic gain (typically 0.5–5% in this timeframe); therefore, demonstrations of breeding progress typically consider genetic gains over several decades (Gourdji et al. 2012; Sharma et al. 2012; Fischer et al. 2014; Crespo-Herrera et al. 2017). Despite the extremely restricted number of years for the current study, demonstration of proof of concept requires formal tests. In order to do this, statistical comparisons were made of the best PT line with the local check, of the best 3 PT lines with the best 3 elite CIMMYT checks, and of the average yield of all PT lines with the average of all checks.

Table 6 Summary of yield ranking of germplasm in 2nd and 3rd WYCYT and 2nd and 4th SATYN across all sites and in low GxE clusters
Table 7 Summary table of sites in each cluster

Comparison of PT lines with that of local checks and elite CIMMYT lines across international target environments

2nd WYCYT

The average yield of all genotypes across all international sites was 5.1 t ha−1 (Table 2). When considering the average performance of all PT lines across all sites, the new material showed a 6% higher yield than the average of all checks (P = 0.04). The best PT line showed a 23% higher yield than the local check (P < 0.01), while the average for the best 3 PT lines expressed 5% more yield than the average for the best 3 elite CIMMYT checks (P = 0.26). The trend was similar for most of the clusters except C5. The highest biomass expressed by a PT line averaged across all sites was 20% higher than the local check (P < 0.01) and while average biomass of all PT lines was 5% higher than all checks, statistical significance was poor (P = 0.32), probably due to the high CV. On average PT lines produced more and heavier grains per m2 than the average of all checks (by 4%, P = 0.17; and 2%, P = 0.10 respectively). The best PT line produced 26% more grains per m2 (P < 0.01) which were 20% heavier (P < 0.01) than the local check, while the average for the best 3 PT lines produced 7% more grains per m2 (P = 0.15) which were 13% heavier than the best 3 elite CIMMYT checks (P < 0.01). Duration to heading was on average similar across genotypes. PT lines were slightly taller on average than all checks (100 cm vs. 98 cm).

3rd WYCYT

The average yield of all genotypes across all international sites was 5.5 t ha−1 (Table 3). When considering the average performance of all PT lines across all sites, the new material showed a 5% higher yield than the average of all checks (P = 0.07). The best PT line showed a 14% higher yield than the local check (P < 0.01), while the average for the best 3 PT lines expressed 5% more yield than the average for the best 3 elite checks (P = 0.26). The trend was similar for 2 of the 3 clusters. The average biomass of all PT lines across all sites was 9% higher than the average of all checks (P = 0.02) and the highest biomass expressed by a PT line averaged across all sites was 15% higher than the local check (P = 0.08). The top 3 PT lines expressed 13% more biomass than the top 3 elite checks (P = 0.09). On average PT lines produced heavier grains than the average of all checks (by 6%, P < 0.01), whilst grain number per m2 (KNO) was similar. The best PT line produced 12% more grains per m2 (P = 0.02) which were 13% heavier than the local check (P < 0.01), while the average for the best 3 PT lines produced 9% more grains per m2 (P = 0.17) which were 7% heavier than the best 3 elite CIMMYT checks (P = 0.04). Duration to heading was on average 2% longer for PT lines than all checks (95 days vs. 93 days). PT lines were 6% taller on average than all checks (104 cm vs. 98 cm).

2nd SATYN

The average yield of all genotypes across all international sites was 3.9 t ha−1 (Table 4). When considering the average performance of all PT lines across all sites, the new material showed a 10% higher yield than the average of all checks (P < 0.01). The best PT line showed a 21% higher yield than the local check (P < 0.01). However, the average for the best 3 PT lines was not different to the average for the best 3 elite CIMMYT checks. The trend was similar for all the clusters of sites identified. The average biomass of all PT lines across all sites was 6% higher than the average of all checks (P = 0.15) and the highest biomass expressed by a PT line averaged across all sites was 34% higher than the local check (P < 0.01). On average PT lines produced more and heavier grains per m2 than the average of all checks (by 3%, P = 0.28; and 8%, P < 0.01 respectively). The best PT line produced 12% more grains per m2 (P = 0.07) which were 20% heavier (P < 0.01) than the local check, while the average for the best 3 PT lines produced fewer grains per m2 but which were 8% heavier (P = 0.03) than the best 3 elite CIMMYT checks. Duration to heading was on average slightly longer for PT lines than all checks (78 days vs. 77 days). PT lines were 4% shorter on average than all checks (95 cm vs. 98 cm).

4th SATYN

The average yield of all genotypes across all international sites was 4.1 t ha−1 (Table 5). When considering the average performance of all PT lines across all sites, the new material yielded slightly less (4%) than the average of all checks, but not significantly so. However, when considering the 4 clusters of sites, the average yield of the best 3 yielding PT lines ranged from 5 to 17% more than the 3 elite CIMMYT checks (significance ranged from P = 0.12 to P = 0.67). Biomass was not significantly different. On average PT lines produced fewer but heavier grains per m2 (P = 0.17) than the average of all checks (by −5 and 3% respectively). The best PT line produced 11% more grains per m2 (P = 0.01) which were 11% heavier (P < 0.01) than the local check, while the average for the best 3 PT lines produced 6% more grains per m2 which were 15% heavier than the best 3 elite CIMMYT checks. Duration to heading was on average the same across genotypes (83 days). PT lines were 4% taller on average than all checks (99 cm vs. 95 cm).

Correlation analysis of trait expression in the breeding environment with yield and trait expression at international sites

Correlation coefficients between the average yield across international sites and the expression of yield, yield components, phenology and height for the same genotypes at the breeding environment in Obregon did not show any unexpected patterns. In many cases yield and biomass showed significant correlations with average yield across international sites, and occasionally one of the other yield components did also, in particular TKW. Phenology in the breeding environment was not a predictor of yield internationally, while height showed a weak negative association with yield. The latter was perhaps partly related to the absence of height reduction alleles associated with exotic material in their pedigree. This was supported by a negative association of height with HI. As expected, the more heritable traits TKW, phenology and height showed relatively strong associations between the selection and target environments, with correlation coefficients generally in the range 0.65–0.95.

Site clusters

Four main clusters of sites can be identified within the dendrogram for yield (Fig. 5): (i) BA-Jo to RO-Fu, (ii) EG-So to MX-Ya_htdrtfeb, (iii) IN-Dh to MX-Ya, and (iv) IR-Za to IR-Pa. Within two of these main clusters there is evidence of smaller clusters of very similar sites: EG-So to MX-Ya_htdrtmar in cluster (ii), and IN-Ka to IN-Ja in cluster (iii). Sites within Mexico are well represented throughout the dendrogram, confirming the value of the centralized breeding platform hosted in this country.

Fig. 5

Dendrogram for similarity in yield ranking across sites using data across all four yield trials. This reflects the correlations among sites in the patterns of yields across different genotypes, converted to a distance measure (1-correlation). In cases where a pair of sites did not have any genotypes in common, the missing value was replaced with the mean correlation prior to conversion into a distance measure
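The distance measure described in this caption can be sketched as below, assuming Pearson correlations between sites computed over the genotypes they share. The data layout, function names, the `min_shared` threshold and the yield values are illustrative assumptions, not the study's code; the caption only specifies that pairs of sites with no genotypes in common receive the mean correlation.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation; returns None if either series has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

def site_distances(yields, min_shared=3):
    """yields: {site: {genotype: adjusted yield}} -> {(site, site): distance}.

    min_shared is an assumption of this sketch: pairs sharing fewer genotypes
    are treated as missing, which includes pairs with none in common.
    """
    corr = {}
    for s, t in combinations(sorted(yields), 2):
        shared = sorted(set(yields[s]) & set(yields[t]))
        if len(shared) >= min_shared:
            corr[(s, t)] = pearson([yields[s][g] for g in shared],
                                   [yields[t][g] for g in shared])
        else:
            corr[(s, t)] = None
    known = [r for r in corr.values() if r is not None]
    mean_r = sum(known) / len(known)
    # Missing correlations take the mean correlation, then convert to distance.
    return {pair: 1.0 - (mean_r if r is None else r) for pair, r in corr.items()}

# Hypothetical adjusted yields (t/ha); site C shares no genotypes with A or B.
yields = {
    "A": {"g1": 5.0, "g2": 6.0, "g3": 7.0},
    "B": {"g1": 4.0, "g2": 5.0, "g3": 6.5},
    "C": {"g4": 3.0, "g5": 4.0, "g6": 5.0},
}
dist = site_distances(yields)
for pair, d in sorted(dist.items()):
    print(pair, round(d, 4))
```

The resulting pairwise distances are what a complete-linkage dendrogram would be built from.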

Discussion

Genetic gains from strategic crosses

Multi-location yield trials confirmed that strategic source-sink crosses targeted to high yield potential and heat stressed environments respectively, resulted in a number of new PT progeny with substantial yield gains across a full range of target environments (Fig. 6). Genetic yield gains are never easy to demonstrate for a number of reasons. First and foremost, yield gains tend to be marginal; for conventional spring wheat breeding at CIMMYT, typical gains average around 0.5% per year when analyzed over two or more decades (Sharma et al. 2012; Crespo-Herrera et al. 2017). In addition, there is usually a large degree of experimental error in field trials. This was indicated in the current study by high values for coefficient of variation even when sites were clustered using GxE modelling (Tables 2, 3, 4, 5). Nonetheless, most new PT lines were superior in yield and biomass to local checks in almost all environments, thus providing valuable new germplasm to national wheat programs. Since a large proportion of the new PT material was derived from exotic sources including landraces and synthetics, it also provides new sources of genetic variation, in agronomically good backgrounds. Showing superiority over elite CIMMYT checks is a larger challenge which nonetheless was achieved in most environments (Fig. 6). While statistical significance was often marginal, the tendency across all 4 nurseries indicated either the superiority of the best new PT lines compared with the best CIMMYT checks, or the superiority of all new PT lines as a group compared with all checks, and in some cases, both. The strategy showed genetic gains in both temperate and hot, irrigated environments. Furthermore, the fact that biomass was up to 9% higher for PT lines as a group compared to the average for checks (except for 4th SATYN), indicates the potential of this material if used in crosses with material better adapted to local conditions in terms of harvest index.

Fig. 6

Percent genetic gains expressed for: (i) the best PT line over the local check (LCH), (ii) the best 3 PT lines over the best 3 elite checks (ECH), and (iii) the average of all PT lines over the average of all checks; for (a) all sites, and (b) the largest cluster of low GxE sites (12, 17, 11 and 13 sites for 2nd WYCYT, 3rd WYCYT, 2nd SATYN, and 4th SATYN, respectively) of each international nursery. See Supplemental Tables 1–4 for data used to calculate genetic gains
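
The three comparisons in Fig. 6 can be illustrated with a minimal sketch. It assumes that percent gain is calculated as 100 × (new − reference) / reference, and that "all checks" pools the elite checks with the local check; neither detail is stated in the caption. The function names and the example yields are hypothetical and are not taken from the Supplemental Tables.

```python
from statistics import mean


def percent_gain(new, ref):
    """Percent gain of `new` over `ref` (assumed form: 100 * (new - ref) / ref)."""
    return 100.0 * (new - ref) / ref


def top_mean(values, n):
    """Mean of the n largest values."""
    return mean(sorted(values, reverse=True)[:n])


def genetic_gains(pt_yields, elite_check_yields, local_check_yield):
    """Return the three comparisons described in the Fig. 6 caption."""
    # Assumption: the local check is pooled with the elite checks for (iii).
    all_checks = list(elite_check_yields) + [local_check_yield]
    return {
        # (i) best PT line over the local check
        "best_pt_vs_local_check": percent_gain(max(pt_yields), local_check_yield),
        # (ii) best 3 PT lines over the best 3 elite checks
        "best3_pt_vs_best3_elite": percent_gain(
            top_mean(pt_yields, 3), top_mean(elite_check_yields, 3)
        ),
        # (iii) average of all PT lines over the average of all checks
        "all_pt_vs_all_checks": percent_gain(mean(pt_yields), mean(all_checks)),
    }


# Hypothetical mean yields (t/ha) for one nursery, for illustration only.
example = genetic_gains(
    pt_yields=[5.0, 5.5, 6.0, 6.5],
    elite_check_yields=[5.0, 5.5, 6.0],
    local_check_yield=5.0,
)
for name, gain in example.items():
    print(f"{name}: {gain:.1f}%")
```

Comparison (i) is sensitive to a single outstanding line, whereas (iii) reflects the PT material as a group, which may be why the caption reports all three.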

The relationship between trait expression in parents and that of their progeny (i.e. PT lines in WYCYT and SATYN), which would determine the source and nature of genetic gains, will be the subject of future analysis from ongoing studies in the breeding environment (the current study was intended to focus on genetic gains in target environments). However, as an example, the highest yielding line internationally from the 3rd WYCYT, namely a PT line with the pedigree MEX94.27.1.20/3/SOKOLL//ATTILA/3*BCN/4/PUB94.15.1.12/WBLL1, was the product of a cross between the parents P1 (MEX94.27.1.20/3/SOKOLL//ATTILA/3*BCN) and P2 (PUB94.15.1.12/WBLL1), which incidentally both have Mexican landraces in their pedigree. Parents and progeny sown side by side in the breeding environment in replicated field trials showed that while both parents had similar yield (2% difference), P1 expressed a greater ‘source’, as indicated by 11% more biomass at maturity than P2, while P2 expressed a greater sink, as indicated by 8% larger HI, 6% larger GSP, and 9% larger TKW than P1. In the same environment, the new PT line expressed 11% more yield, 4% larger HI, and 2% more TKW and biomass at maturity than the best value of parental expression. The new PT line expressed 5% more yield and almost 20% larger TKW and biomass at maturity than the elite check Borlaug 100, which is the best yielding elite CIMMYT line currently available; its superior performance over Borlaug 100 was also reflected at international sites in terms of yield, biomass and TKW (Table 3).
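
As a rough consistency check on the parental figures, the contrast can be worked through using only the identity from the Introduction that yield is the product of biomass and harvest index. The subscripts s (the higher-biomass, 'source' parent) and k (the higher-HI, 'sink' parent) are introduced here for illustration and are not the authors' notation; the calculation also assumes that the reported HI and biomass at maturity refer to the same plots.

```latex
\[
Y = BM \times HI
\quad\Longrightarrow\quad
\frac{Y_{s}}{Y_{k}}
= \frac{BM_{s}}{BM_{k}} \times \frac{HI_{s}}{HI_{k}}
\approx 1.11 \times \frac{1}{1.08}
\approx 1.03
\]
```

Under these assumptions the two parents would differ in yield by roughly 3%, of the same order as the reported 2% difference; rounding of the reported percentages and field error could account for the gap. The example shows how an advantage in source in one parent and an advantage in sink in the other can largely offset each other in terms of parental yield, while remaining available for combination in the progeny.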

Balancing biomass (source) with yield components (sink) through breeding

While a number of studies have suggested that yield and radiation use efficiency can be increased by improving source and sink balance (Slafer and Savin 1994; Reynolds et al. 2012), this is the first time the hypothesis has been tested in a realistic breeding context. Results indicate that selecting parents with high expression of biomass (source) and crossing to sources with good expression of sink traits (HI, KNO, TKW etc.) resulted in genetic gains for both yield and biomass, supporting the idea that a more favorable sink capacity boosts RUE as well as yield. Clearly the phenomenon needs to be tested further. Nonetheless, evidence to support the idea has recently come to light from an extremely high yielding durum line (Cirno). The line shows relatively low RUE before anthesis and then, after setting a very high number of grains with high grain weight potential, switches to a very high RUE during grain-filling (Molero and Reynolds, unpublished data). In the meantime, studies are underway in the breeding environment comparing the best PT progeny of the WYCYT and SATYN lines with their parents, to track the expression of source and sink traits from parents to progeny (as outlined above) and their apparent interaction.

Agronomic traits: phenology, height, disease resistance, and broad adaptation of PT material

While the PT lines as a whole in the WYCYT and SATYN nurseries represented a wider range of height and maturity class than the checks, the best material was generally similar in these characteristics to the checks. Furthermore, leaf and yellow rust screening at hotspots in Mexico indicated that the majority of the material had acceptable levels of resistance. Much of the PT material was derived from relatively exotic germplasm like landraces and synthetics and subjected to very restricted rounds of selection (Fig. 3). Therefore, broad adaptation would not necessarily be expected. This was reflected by the fact that while in most clusters (Table 6), and sites (not shown), a number of PT lines outperformed any of the checks, they were not necessarily the same lines across all clusters/sites. Such material, if consistently yielding well, would be useful locally. Where material expresses high biomass and acceptable yield (i.e. not optimally adapted) it could be crossed with locally adapted elite lines as a source of RUE. Nonetheless, a significant number of lines were superior in yield and biomass to the checks across all sites and/or within clusters as shown by statistical analysis, indicating broad adaptive features in many cases (Fig. 6).

PT breeding approach and wider breeding objectives

The PT pre-breeding approach used in this study is intended to test hypotheses about potentially useful traits in a realistic context, namely through making strategic crosses and testing selected progeny in target environments (Reynolds and Langridge 2016). A large body of literature exists on many different candidate traits and alleles for boosting yield (e.g. see the bibliography and references therein), but until tested in a realistic breeding context, they remain hypothetical. In fact, strategic crossing is the only way to test the true value of a trait (or a constellation of traits, or the alleles associated with them) for use in mainstream breeding. Without such proof of concept, productive crop improvement programs are understandably reluctant to incorporate new traits or methodologies into already successful pipelines. The most promising of these candidate traits and alleles should be tested through pre-breeding (e.g. see International Wheat Yield Partnership; http://iwyp.org/).

Furthermore, while the products of such pre-breeding may lack a number of important traits, for example, related to end-use quality and specific disease resistance profiles, if they show superior yield or biomass, they can be used as parental sources in backcrossing or other breeding methodologies to introduce new genetic diversity into well-established backgrounds, enriching the genepool. Occasionally, products of experimental pre-breeding may have all the requisite agronomic traits, as evidenced by the dissemination of PT lines in CIMMYT international nurseries like SAWYT (http://wheatatlas.org/nurseries/iwin), as well as their promotion as candidates for release by national wheat programs, and—in the case of Pakistan—the recent release of PT lines as new cultivars (Manes, Imtiaz and Reynolds, unpublished).

Mainstream breeding programs are incorporating trait-based approaches in progeny selection more commonly now (e.g. Rutkoski 2016), especially with the advent of high throughput phenotyping, which can help overcome problems associated with low heritability/repeatability of complex traits related to yield (Tattaris et al. 2016). However, this study shows that trait-based approaches have at least as much value in the selection of parents for strategic crossing (Richards 2006; Reynolds and Langridge 2016). When characterizing and selecting complementary parents it is useful to consider a ‘pyramid’ of traits going from highly integrative to more simple (Fig. 1). In this study, crosses between lines with favorable expression of traits related to BM on the one hand, and high HI on the other, were successful. Nonetheless, BM and HI are necessarily a function of genetically simpler traits, which can be considered in greater detail when making more complex strategic crosses (Fig. 1; Reynolds et al. 2012).

Physiological breeding as described here is not restricted to strategic crossing of phenotypic traits. It can also make use of markers, though currently these are more likely to be for alleles associated with genes of major effect, such as Ppd, Vrn, and Rht in wheat (Eagles et al. 2014). Nonetheless, since selecting among early generation progeny for expression of complex traits is not always feasible (due in part to very large numbers), it is expected that genomic selection, potentially in combination with high throughput phenotyping of a few integrative traits like canopy temperature and vegetative indices (Rutkoski 2016), will find valuable application in selection of early generations based on strategic crossing for complex physiological traits. Marker assisted selection (MAS) has not delivered as expected (Langridge and Reynolds 2015) but remains a possibility, especially if QTL × environment interactions can be more effectively modeled (Millet et al. 2016).

Conclusions

Results provide a proof of concept that selecting parents with high expression of biomass (source) and crossing to lines with good expression of HI and other yield components (sink) can boost genetic gains. This is perhaps to be expected, since these traits are either drivers or components of yield, and their strategic combination is more likely to result in the accumulation of complementary alleles than when simply crossing among elite high yielding lines, which may have largely the same physiological or genetic basis for achieving yield. Some of the best new PT lines have already been selected as candidates for national release by the country programs of many of the authors herein, and two PT lines have been released in Pakistan for heat stressed conditions contributing to the climate resilience of wheat—namely Pakistan-13: MEX94.27.1.20/3/SOKOLL//ATTILA/3*BCN, and Borlaug-16: SOKOLL/3/PASTOR//HXL7573/2*BAU. Results of the current study support the use of genetic resources like landraces and products of wide crossing (i.e. synthetic wheat) in providing new and better sources of important traits like biomass and kernel size (see pedigrees in supplemental Tables); the cultivars released in Pakistan include a Mexican landrace, and synthetic-derived SOKOLL, in their respective pedigrees. Such successful germplasm provides opportunities for molecular studies to demonstrate genetic contributions from exotic parents to support more wide scale molecular applications in breeding, genetic resource screening, and gene editing.