Abstract
Key message
Genomic predictions across environments and within populations yielded moderate to high accuracies, but across-population genomic prediction should not be considered in wheat when population sizes are small.
Abstract
Genomic selection (GS) is a marker-based selection method proposed to increase the genetic gain of quantitative traits in plant breeding programs. We evaluated the effects of training population (TP) composition, cross-validation design, and genetic relationship between the training and breeding populations on the accuracy of GS in spring wheat (Triticum aestivum L.). Two populations of 231 and 304 spring hexaploid wheat lines, phenotyped for six agronomic traits and genotyped with the wheat 90K array, were used to assess the accuracy of seven GS models (RR-BLUP, G-BLUP, BayesB, BL, RKHS, GS + de novo GWAS, and reaction norm) under different cross-validation designs. BayesB outperformed the other models for within-population genomic predictions in the presence of a few quantitative trait loci (QTL) with large effects. However, including fixed-effect marker covariates gave better performance for across-population prediction when the same QTL underlie traits in both populations. Prediction accuracy varied widely with the cross-validation design, which underscores the importance of using a design that resembles the variation within a breeding program. Moderate to high accuracies were obtained when predictions were made within populations. In contrast, across-population genomic prediction accuracies were very low, suggesting that the evaluated models are not suitable for prediction across independent populations. On the other hand, across-environment prediction and forward prediction designs using the reaction norm model resulted in moderate to high accuracies, suggesting that GS can be applied in wheat to predict the performance of newly developed lines and of lines in incomplete field trials.
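The within-population cross-validation idea described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the marker matrix and phenotypes are simulated, the function names (`fit_ridge`, `predict_ridge`, `kfold_accuracy`) are hypothetical, and the shrinkage parameter `lam` is fixed by hand, whereas RR-BLUP would normally estimate it from variance components (for example by REML). Accuracy is taken here as the Pearson correlation between predicted and observed phenotypes in each held-out fold, which is one common convention and an assumption of this sketch.

```python
import numpy as np

def fit_ridge(Z, y, lam):
    """Ridge regression on centred markers: solve (Zc'Zc + lam*I) u = Zc'(y - mu)."""
    mu = y.mean()
    col_means = Z.mean(axis=0)
    Zc = Z - col_means
    p = Z.shape[1]
    u = np.linalg.solve(Zc.T @ Zc + lam * np.eye(p), Zc.T @ (y - mu))
    return mu, col_means, u

def predict_ridge(Z_new, mu, col_means, u):
    """Predict phenotypes of new lines from their marker scores."""
    return mu + (Z_new - col_means) @ u

def kfold_accuracy(Z, y, lam, k=5, seed=0):
    """Mean Pearson correlation between predicted and observed values over k folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    accuracies = []
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        mu, col_means, u = fit_ridge(Z[train], y[train], lam)
        pred = predict_ridge(Z[test], mu, col_means, u)
        accuracies.append(np.corrcoef(pred, y[test])[0, 1])
    return float(np.mean(accuracies))

# Simulated data only (assumption): 231 lines, 300 biallelic markers coded -1/1,
# 10 causal markers, and residual variance set so heritability is roughly 0.5.
rng = np.random.default_rng(42)
n_lines, n_markers, n_qtl = 231, 300, 10
Z = rng.choice([-1.0, 1.0], size=(n_lines, n_markers))
effects = np.zeros(n_markers)
effects[rng.choice(n_markers, size=n_qtl, replace=False)] = rng.normal(size=n_qtl)
g = Z @ effects
y = g + rng.normal(scale=g.std(), size=n_lines)

acc = kfold_accuracy(Z, y, lam=float(n_markers), k=5)
print(f"mean 5-fold prediction accuracy: {acc:.2f}")
```

An across-population design would differ only in how the split is made: the model would be fitted on all lines of one population and evaluated on the other, rather than on random folds within a single population.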
Acknowledgements
This study was conducted as part of the Canadian Triticum Applied Genomics (CTAG2) project and funded by Genome Canada (Grant number: 8310), Saskatchewan Ministry of Agriculture, Western Grains Research Foundation, Saskatchewan Wheat Development Commission, and Government of Saskatchewan. We would like to acknowledge the technical assistance from the Durum Wheat Breeding and Genetics field and molecular laboratory staff at the University of Saskatchewan.
Contributions
TAH designed the experiment, generated phenotypic and marker data, performed all analyses, and wrote the manuscript. SW, AN, and JMC edited the manuscript. PJH designed the experiment, developed and maintained early generation of the breeding population, and edited the manuscript. RDC and REK collected phenotypic data, and edited the manuscript. CJP acquired funding, designed the experiment, supervised the project, collected phenotypic data and edited the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by Hiroyoshi Iwata.
John M. Clarke: Deceased 01 February 2020.
Electronic supplementary material
Online Resource 2: Markers fitted as fixed effects for genomic predictions within and across populations using GS + de novo GWAS model (CSV 114 kb)
Online Resource 5: Frequency distributions of agronomic traits in the training population (PDF 3 kb)
Online Resource 6: Frequency distribution of agronomic traits in the breeding population. The values for the check cultivars (AC Barrie, CDC Utmost, CDC Plentiful, and Pasteur) are indicated with arrows (PDF 29 kb)
Online Resource 7: Quantile–quantile (Q–Q) plots of the association analysis using Mixed Linear Model with Kinship matrix and five marker-derived principal components (PDF 416 kb)
Online Resource 9: Summary of environment specific QTL identified for six agronomic traits in the breeding population (DOCX 26 kb)
About this article
Cite this article
Haile, T.A., Walkowiak, S., N’Diaye, A. et al. Genomic prediction of agronomic traits in wheat using different models and cross-validation designs. Theor Appl Genet 134, 381–398 (2021). https://doi.org/10.1007/s00122-020-03703-z
DOI: https://doi.org/10.1007/s00122-020-03703-z