Integrated nested Laplace approximation inference and cross-validation to tune variance components in estimation of breeding value
The main aim of this study was to compare several recently proposed Bayesian and frequentist statistical methods for estimating genetic parameters, and to apply cross-validation (CV) to tune the variance components in simulated and field plant breeding datasets. We were especially interested in whether CV could improve the prediction accuracy of breeding values obtained using residual (or restricted/reduced) maximum likelihood and Markov chain Monte Carlo estimation tools. We showed that integrated nested Laplace approximation (INLA), a nonsampling-based Bayesian inference method, can be used for rapid and accurate estimation of genetic parameters in linear mixed models with multiple random effects such as additive, dominance, and genotype-by-environment interaction effects. We also compared the INLA estimates with results obtained using Markov chain Monte Carlo and restricted maximum likelihood methods. In other studies, K-fold CV is primarily used for comparing method performance; here, however, we showed that K-fold CV can also be used to tune genetic parameters and minimize the prediction error in the estimation of breeding values. We further compared the K-fold CV results with those of different generalized cross-validation methods, which are much faster to compute. Results from the analysis of field and simulated datasets are presented.
Keywords: Cross-validation; Generalized cross-validation; INLA; Bayesian analysis; Estimation of genetic parameters
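The idea of tuning a variance component by cross-validation can be illustrated with a minimal Python sketch. It is not the model or software used in the study: it assumes a simplified ridge-type BLUP with a single random effect, in which the tuning parameter `lam` plays the role of the variance ratio (residual variance divided by the variance of the random effects). The simulated genotype matrix `W`, the grid `lam_grid`, and the helper names `ridge_blup`, `kfold_cv_error`, and `gcv_error` are all hypothetical and chosen for illustration only. The generalized cross-validation score uses the standard form n·RSS/(n − tr H)² and, as a further simplification, ignores the degree of freedom spent on the mean.

```python
import numpy as np


def ridge_blup(W, y, lam):
    """Ridge-type BLUP: solve (W'W + lam*I) a = W'(y - mean(y)).

    lam stands in for the variance ratio sigma_e^2 / sigma_a^2 (assumption:
    one random effect with identity covariance).
    """
    mu = y.mean()
    p = W.shape[1]
    a = np.linalg.solve(W.T @ W + lam * np.eye(p), W.T @ (y - mu))
    return mu, a


def kfold_cv_error(W, y, lam, k=5, seed=0):
    """Mean squared prediction error of held-out records over k folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    sq_err = 0.0
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        mu, a = ridge_blup(W[train], y[train], lam)
        sq_err += np.sum((y[test] - (mu + W[test] @ a)) ** 2)
    return sq_err / len(y)


def gcv_error(W, y, lam):
    """Generalized CV score n*RSS/(n - tr(H))^2; needs only one fit per lam."""
    n, p = W.shape
    yc = y - y.mean()
    H = W @ np.linalg.solve(W.T @ W + lam * np.eye(p), W.T)
    rss = np.sum((yc - H @ yc) ** 2)
    return n * rss / (n - np.trace(H)) ** 2


# Simulated data (purely illustrative, not from the study).
rng = np.random.default_rng(1)
n, p = 200, 50
W = rng.integers(0, 3, size=(n, p)).astype(float)  # genotype codes 0/1/2
W -= W.mean(axis=0)                                # centre the columns
y = 10.0 + W @ rng.normal(0.0, 0.3, p) + rng.normal(0.0, 1.0, n)

# Grid search: keep the variance ratio with the smallest prediction error.
lam_grid = np.logspace(-2, 3, 20)
cv_errors = [kfold_cv_error(W, y, lam) for lam in lam_grid]
gcv_errors = [gcv_error(W, y, lam) for lam in lam_grid]
best_lam = lam_grid[int(np.argmin(cv_errors))]
best_lam_gcv = lam_grid[int(np.argmin(gcv_errors))]
print("K-fold CV choice:", best_lam, " GCV choice:", best_lam_gcv)
```

In this sketch, K-fold CV refits the model k times for every grid value, whereas the GCV score needs a single fit per grid value, which is why GCV-type criteria are much cheaper to compute.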
We thank Håvard Rue and Anna Marie Holand for their help with the implementation of our model using the R-INLA package. We also thank Jarrod D. Hadfield for helping us with the use of the MCMCglmm package.