The impact of population structure on genomic prediction in stratified populations

  • Original Paper
  • Published in Theoretical and Applied Genetics

Abstract

Key message

Impacts of population structure on the evaluation of genomic heritability and prediction were investigated and quantified using high-density markers in diverse panels in rice and maize.

Abstract

Population structure is an important factor affecting the estimation of genomic heritability and the assessment of genomic prediction in stratified populations. In this study, our first objective was to assess the effects of population structure on estimates of genomic heritability using diversity panels in rice and maize. Results indicate that population structure explained 33 and 7.5 % of genomic heritability for rice and maize, respectively, depending on traits, with the remaining heritability explained by within-subpopulation variation. Estimates of within-subpopulation heritability were higher than those derived from quantitative trait loci identified in genome-wide association studies, suggesting a 65 % improvement in genetic gains. The second objective was to evaluate the effects of population structure on genomic prediction using cross-validation experiments. When population structure was present in both training and validation sets, correcting for population structure led to a significant decrease in the accuracy of genomic prediction. In contrast, when prediction was limited to a specific subpopulation, population structure had little effect on accuracy, and within-subpopulation genetic variance dominated predictions. Finally, the effects of genomic heritability on genomic prediction were investigated. Genomic prediction accuracy increased with genomic heritability in both training and validation sets, with the former showing a slightly greater impact. In summary, our results suggest that the contribution of population structure to genomic prediction varies with the prediction strategy and is also affected by the genetic architectures of traits and populations. In practical breeding, these conclusions may help breeders to better understand and utilize different genetic resources in genomic prediction.
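The cross-validation comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' analysis: the two-subpopulation simulation, the ridge-type (GBLUP-like) predictor, the use of leading principal components as the structure correction, and all function names and parameter values (`simulate_stratified`, `cv_accuracy`, `pop_effect`, `lam`) are assumptions chosen for illustration. It only shows how accuracy across a stratified panel might be compared with and without removing the structure signal.

```python
import numpy as np


def simulate_stratified(n_per_pop=100, n_markers=500, n_qtl=50,
                        h2=0.5, pop_effect=2.0, seed=0):
    """Toy data: two subpopulations with divergent allele frequencies."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.2, 0.8, n_markers)
    shift = rng.normal(0.0, 0.15, n_markers)
    freqs = [np.clip(base + shift, 0.05, 0.95),
             np.clip(base - shift, 0.05, 0.95)]
    X = np.vstack([rng.binomial(2, f, size=(n_per_pop, n_markers))
                   for f in freqs]).astype(float)
    pop = np.repeat([0, 1], n_per_pop)
    beta = np.zeros(n_markers)
    beta[rng.choice(n_markers, n_qtl, replace=False)] = rng.normal(size=n_qtl)
    g = X @ beta
    # Standardize genetic values within each subpopulation, then add an
    # explicit between-subpopulation shift (assumed, in within-group SD units).
    for k in (0, 1):
        idx = pop == k
        g[idx] = (g[idx] - g[idx].mean()) / g[idx].std()
    g = g + pop_effect * pop
    e = rng.normal(0.0, np.sqrt(g.var() * (1 - h2) / h2), size=g.size)
    return X, g + e, pop


def cv_accuracy(X, y, n_pcs=0, n_folds=5, lam=1.0, seed=1):
    """Mean k-fold correlation between predicted and observed phenotypes.

    With n_pcs > 0, the leading principal components of the centered marker
    matrix are projected out of the markers and regressed out of the
    phenotypes (coefficients fitted on the training fold only).
    """
    n = len(y)
    Z = X - X.mean(axis=0)
    if n_pcs > 0:
        U = np.linalg.svd(Z, full_matrices=False)[0][:, :n_pcs]
        Z = Z - U @ (U.T @ Z)
        D = np.column_stack([np.ones(n), U])
    K = Z @ Z.T / Z.shape[1]
    K = K / np.mean(np.diag(K))  # marker-based relationship matrix
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    accs = []
    for val in folds:
        train = np.setdiff1d(np.arange(n), val)
        y_use = y
        if n_pcs > 0:
            coef = np.linalg.lstsq(D[train], y[train], rcond=None)[0]
            y_use = y - D @ coef
        mu = y_use[train].mean()
        A = K[np.ix_(train, train)] + lam * np.eye(train.size)
        pred = mu + K[np.ix_(val, train)] @ np.linalg.solve(A, y_use[train] - mu)
        accs.append(np.corrcoef(pred, y_use[val])[0, 1])
    return float(np.mean(accs))


X, y, pop = simulate_stratified()
print("accuracy without correction:", round(cv_accuracy(X, y, n_pcs=0), 3))
print("accuracy with 2 PCs removed:", round(cv_accuracy(X, y, n_pcs=2), 3))
```

In this toy setting, much of the uncorrected accuracy comes from predicting the difference between subpopulation means, so removing the leading components would be expected to lower it. The size of the drop depends entirely on the assumed `pop_effect` and heritability and says nothing about the rice or maize panels.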

Acknowledgments

The authors thank the researchers and institutions who contributed to the development of the rice and maize diversity panels. The authors are also grateful to the editor and three anonymous reviewers for their detailed input in assessing and improving the manuscript.

Conflict of interest

The authors declare that they have no conflict of interest.

Author information

Corresponding author

Correspondence to Zhigang Guo.

Additional information

Communicated by J. Crossa.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 371 kb)

About this article

Cite this article

Guo, Z., Tucker, D.M., Basten, C.J. et al. The impact of population structure on genomic prediction in stratified populations. Theor Appl Genet 127, 749–762 (2014). https://doi.org/10.1007/s00122-013-2255-x

  • DOI: https://doi.org/10.1007/s00122-013-2255-x
