Theoretical and Applied Genetics, Volume 130, Issue 6, pp 1277–1284

A fast genomic selection approach for large genomic data

Original Article

Abstract

Key message

We propose a novel computational method for genomic selection that combines identical-by-state (IBS)-based Haseman–Elston (HE) regression and best linear prediction (BLP), called HE-BLP.

Abstract

Genomic best linear unbiased prediction (GBLUP) has been widely used in whole-genome prediction for breeding programs. To determine the total genetic variance of a training population, a linear mixed model (LMM) must be solved via restricted maximum likelihood (REML), whose computational complexity is cubic in the sample size. We propose a novel computational method, called HE-BLP, that combines identical-by-state (IBS)-based Haseman–Elston (HE) regression and best linear prediction (BLP). With this method, the total genetic variance can be estimated by solving a simple HE linear regression, whose computational complexity is quadratic in the sample size. The method is therefore suitable for large-scale genomic data, except when environmental effects need to be estimated simultaneously, which it does not allow. In Monte Carlo simulation studies, the estimated heritability based on HE was identical to that based on REML, and the prediction accuracy of HE-BLP and traditional GBLUP was also quite similar when quantitative trait loci (QTLs) were randomly distributed along the genome and their effects followed a normal distribution. In addition, the kernel row number (KRN) trait in a maize IBM population was used to evaluate the performance of the two methods. The results showed similar prediction accuracy of breeding values despite slightly different heritability estimates from HE and REML, probably due to the underlying genetic architecture. HE-BLP can be a future choice of genomic selection method for even larger sets of genomic data in the special cases where environmental effects can be ignored. The software for HE regression and the simulation program are available online in the Genetic Analysis Repository (GEAR; https://github.com/gc5k/GEAR/wiki).
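The two steps described above can be illustrated with a minimal Python sketch. It is not the GEAR implementation, and several details are assumptions made only for illustration: the relationship matrix is built from standardized 0/1/2 allele counts, the HE regression uses the cross-product form (products of standardized phenotypes regressed on off-diagonal relationships, so the slope estimates heritability), and the function names `grm`, `he_heritability`, `blp` and `simulate` are hypothetical.

```python
import numpy as np

def grm(genotypes):
    """Relationship matrix from an (n x m) matrix of 0/1/2 allele counts."""
    freq = genotypes.mean(axis=0) / 2.0
    keep = (freq > 0) & (freq < 1)            # drop monomorphic markers
    p = freq[keep]
    z = (genotypes[:, keep] - 2 * p) / np.sqrt(2 * p * (1 - p))
    return z @ z.T / keep.sum()

def he_heritability(G, y):
    """HE regression (cross-product form, an assumption of this sketch).

    Regresses y_i * y_j on G_ij over all pairs i < j; the work is
    proportional to the number of pairs, i.e. quadratic in n.
    """
    y = (y - y.mean()) / y.std()
    iu = np.triu_indices_from(G, k=1)         # off-diagonal pairs only
    x = G[iu]
    prod = np.outer(y, y)[iu]
    xc = x - x.mean()
    return float(xc @ (prod - prod.mean()) / (xc @ xc))

def blp(G_train, G_test_train, y_train, h2):
    """Best linear prediction of genetic values given a heritability estimate."""
    h2 = min(max(h2, 0.01), 0.99)             # keep the covariance matrix well conditioned
    n = G_train.shape[0]
    V = h2 * G_train + (1.0 - h2) * np.eye(n)
    return h2 * G_test_train @ np.linalg.solve(V, y_train - y_train.mean())

def simulate(n=800, m=300, h2=0.5, seed=1):
    """Toy data: every marker is a QTL with a normally distributed effect."""
    rng = np.random.default_rng(seed)
    p = rng.uniform(0.1, 0.9, m)
    geno = rng.binomial(2, p, size=(n, m)).astype(float)
    g = geno @ rng.normal(size=m)
    g = (g - g.mean()) / g.std() * np.sqrt(h2)
    y = g + rng.normal(scale=np.sqrt(1.0 - h2), size=n)
    return geno, g, y

geno, g, y = simulate()
G = grm(geno)
n_train = 600                                 # first 600 individuals form the training set
h2_hat = he_heritability(G[:n_train, :n_train], y[:n_train])
g_hat = blp(G[:n_train, :n_train], G[n_train:, :n_train], y[:n_train], h2_hat)
print("HE heritability estimate:", round(h2_hat, 3))
print("prediction accuracy:", round(np.corrcoef(g_hat, g[n_train:])[0, 1], 3))
```

Only the variance-estimation step is quadratic here. The prediction step in this sketch still solves a dense linear system, so it illustrates the logic of the method rather than its computational cost at scale.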

Notes

Acknowledgements

The authors are grateful to the editor and the two anonymous reviewers for their constructive comments, and Peter Bommert for providing information for the maize IBM population.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflicts of interest.

Supplementary material

122_2017_2887_MOESM1_ESM.docx (32 kb)
Supplementary material 1 (DOCX 33 KB)


Copyright information

© Springer-Verlag Berlin Heidelberg 2017

Authors and Affiliations

  1. Maize Research Institute, Sichuan Agricultural University, Chengdu, China
  2. Evergreen Landscape and Architecture Studio, Hangzhou, China
