A fast genomic selection approach for large genomic data

Liu, Hailan; Chen, Guo-Bo

doi:10.1007/s00122-017-2887-3


Original Article
Published: 07 April 2017

Volume 130, pages 1277–1284, (2017)

Theoretical and Applied Genetics

Hailan Liu¹ &
Guo-Bo Chen²


Abstract

Key message

We propose a novel computational method for genomic selection that combines identical-by-state (IBS)-based Haseman–Elston (HE) regression and best linear prediction (BLP), called HE-BLP.

Abstract

Genomic best linear unbiased prediction (GBLUP) has been widely used in whole-genome prediction for breeding programs. To determine the total genetic variance of a training population, a linear mixed model (LMM) must be solved via restricted maximum likelihood (REML), whose computational complexity scales with the cube of the sample size. We propose a novel computational method, called HE-BLP, that combines identical-by-state (IBS)-based Haseman–Elston (HE) regression with best linear prediction (BLP). With this method, the total genetic variance is estimated by solving a simple HE linear regression, whose computational complexity scales with the square of the sample size. The method is therefore suitable for large-scale genomic data, except for data in which environmental effects need to be estimated simultaneously, because it does not allow for this estimation. In Monte Carlo simulation studies, the heritability estimated by HE was identical to that estimated by REML, and the prediction accuracy of HE-BLP and traditional GBLUP was also quite similar when quantitative trait loci (QTLs) were randomly distributed along the genome and their effects followed a normal distribution. In addition, the kernel row number (KRN) trait in a maize IBM population was used to evaluate the performance of the two methods; the results showed similar prediction accuracy of breeding values despite slightly different heritability estimates from HE and REML, probably due to the underlying genetic architecture. HE-BLP may be a suitable choice of genomic selection method for even larger sets of genomic data in special cases where environmental effects can be ignored. The software for HE regression and the simulation program are available online in the Genetic Analysis Repository (GEAR; https://github.com/gc5k/GEAR/wiki).
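The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the GEAR implementation. It assumes a standardized marker-based relationship matrix as a stand-in for the IBS-based relatedness, the cross-product form of HE regression on standardized phenotypes, and a plug-in BLP that uses the estimated heritability. All function names (`standardize_genotypes`, `relationship_matrix`, `he_heritability`, `blp_predict`, `simulate`, `run_demo`) and the simulation settings are hypothetical and chosen only for illustration.

```python
import numpy as np


def standardize_genotypes(genotypes):
    """Center and scale each marker column; drop monomorphic markers."""
    z = genotypes - genotypes.mean(axis=0)
    sd = z.std(axis=0)
    keep = sd > 0
    return z[:, keep] / sd[keep]


def relationship_matrix(z):
    """Marker-based relationship matrix G = Z Z' / m (assumed stand-in for IBS)."""
    return z @ z.T / z.shape[1]


def he_heritability(y, g):
    """HE regression: slope of y_i * y_j on G_ij over all pairs i < j."""
    y_std = (y - y.mean()) / y.std()
    upper = np.triu_indices(len(y), k=1)
    relatedness = g[upper]
    cross_products = np.outer(y_std, y_std)[upper]
    centered = relatedness - relatedness.mean()
    slope = centered @ (cross_products - cross_products.mean()) / (centered @ centered)
    # Keep the estimate inside (0, 1) so the BLP step below stays well defined.
    return float(np.clip(slope, 1e-6, 1.0 - 1e-6))


def blp_predict(y_train, g_train, g_test_train, h2):
    """Plug-in BLP of genetic values (plus the training mean) for test individuals."""
    mu, sd = y_train.mean(), y_train.std()
    y_std = (y_train - mu) / sd
    # Phenotypic covariance on the standardized scale: V = h2 * G + (1 - h2) * I.
    v = h2 * g_train + (1.0 - h2) * np.eye(len(y_train))
    return mu + sd * h2 * (g_test_train @ np.linalg.solve(v, y_std))


def simulate(n=1000, m=200, h2=0.5, seed=1):
    """Toy data: every marker is a QTL with a normally distributed effect."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.9, size=m)
    genotypes = rng.binomial(2, freqs, size=(n, m)).astype(float)
    z = standardize_genotypes(genotypes)
    effects = rng.normal(size=z.shape[1])
    genetic = z @ effects
    genetic = genetic / genetic.std() * np.sqrt(h2)
    y = genetic + rng.normal(scale=np.sqrt(1.0 - h2), size=n)
    return z, genetic, y


def run_demo(n_train=800):
    """Estimate h2 on the training set, then predict the held-out individuals."""
    z, genetic, y = simulate()
    g = relationship_matrix(z)
    train, test = slice(0, n_train), slice(n_train, None)
    h2_hat = he_heritability(y[train], g[train, train])
    pred = blp_predict(y[train], g[train, train], g[test, train], h2_hat)
    accuracy = float(np.corrcoef(pred, genetic[test])[0, 1])
    return h2_hat, accuracy


if __name__ == "__main__":
    h2_hat, accuracy = run_demo()
    print(f"HE heritability estimate: {h2_hat:.2f}")
    print(f"Correlation of predictions with true genetic values: {accuracy:.2f}")
```

In this sketch the HE step visits each of the n(n-1)/2 pairs of individuals once, which is where the quadratic cost comes from. The dense linear solve in `blp_predict` is a convenience of the illustration and says nothing about how the published software carries out the prediction step.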



Acknowledgements

The authors are grateful to the editor and the two anonymous reviewers for their constructive comments, and to Peter Bommert for providing information on the maize IBM population.

Author information

Authors and Affiliations

Maize Research Institute, Sichuan Agricultural University, Chengdu, Sichuan Province, 611130, China
Hailan Liu
Evergreen Landscape and Architecture Studio, Xixi Road 562, Hangzhou, Zhejiang Province, 310026, China
Guo-Bo Chen


Corresponding authors

Correspondence to Hailan Liu or Guo-Bo Chen.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Additional information

Communicated by Mikko J. Sillanpaa.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 33 KB)


About this article

Cite this article

Liu, H., Chen, GB. A fast genomic selection approach for large genomic data. Theor Appl Genet 130, 1277–1284 (2017). https://doi.org/10.1007/s00122-017-2887-3


Received: 26 September 2016
Accepted: 27 February 2017
Published: 07 April 2017
Issue Date: June 2017
DOI: https://doi.org/10.1007/s00122-017-2887-3

