Genomic prediction using an iterative conditional expectation algorithm for a fast BayesC-like model

  • Original Paper
  • Journal: Genetica

Abstract

Genomic prediction of breeding values is feasible because of dense genome-wide markers and credible statistical methods, such as Genomic Best Linear Unbiased Prediction (GBLUP) and various Bayesian methods. Compared with GBLUP, Bayesian methods allow more flexible assumptions about the distribution of SNP effects. However, most Bayesian methods rely on Markov chain Monte Carlo (MCMC) algorithms, which makes them computationally expensive. Hence, fast Bayesian approaches, such as fast BayesB (fBayesB), have been proposed to speed up the calculation. This study proposes another fast Bayesian method, termed fast BayesC (fBayesC). The prior distribution of fBayesC assumes that each SNP has, with probability γ, a non-zero effect drawn from a normal distribution with a common variance. Simulated data from the QTLMAS XII workshop and actual data on large yellow croaker were used to compare the predictive results of fBayesB, fBayesC and (MCMC-based) BayesC. When γ was set to a small value, such as 0.01 in the simulated data or 0.001 in the actual data, fBayesB and fBayesC yielded lower prediction accuracies (abilities) than BayesC. In the actual data, fBayesC yielded predictive abilities very similar to those of BayesC when γ ≥ 0.01. When γ = 0.01, fBayesB also yielded results similar to those of fBayesC and BayesC. However, fBayesB could not yield an explicit result when γ ≥ 0.1, whereas no such problem was observed for fBayesC. Moreover, fBayesC was significantly faster than BayesC, making fBayesC a promising method for genomic prediction.
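To make the prior described in the abstract concrete, the following is a minimal sketch of how an iterative conditional expectation scheme could be written for a spike-and-slab prior in which each SNP effect is zero with probability 1 − γ and normal with a common variance otherwise. It is not the authors' implementation: the function names (posterior_mean_effect, ice_fit), the treatment of the effect variance var_g and residual variance var_e as known constants, and the convergence settings are all assumptions made for illustration. The closed-form conditional mean used here is the standard result for a normal likelihood combined with a point mass at zero plus a normal slab.

```python
import numpy as np


def posterior_mean_effect(y_hat, se2, gamma, var_g):
    """E[g | y_hat] when g is 0 with prob. 1-gamma, else N(0, var_g),
    and y_hat | g ~ N(g, se2). Names and signature are illustrative."""
    total = var_g + se2
    # Log posterior odds of "non-zero effect" versus "zero effect".
    log_odds = (
        np.log(gamma) - np.log1p(-gamma)
        + 0.5 * np.log(se2 / total)
        + 0.5 * y_hat ** 2 * (1.0 / se2 - 1.0 / total)
    )
    # Clip before exponentiating to avoid overflow for extreme odds.
    p_nonzero = 1.0 / (1.0 + np.exp(-np.clip(log_odds, -700.0, 700.0)))
    # Probability of being non-zero times the usual normal shrinkage.
    return p_nonzero * (var_g / total) * y_hat


def ice_fit(X, y, gamma, var_g, var_e, n_iter=100, tol=1e-8):
    """Cycle over SNPs, replacing each effect by its conditional
    expectation given the current estimates of all other effects.
    Assumes var_g and var_e are known (a simplification)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    m = X.shape[1]
    mu = y.mean()
    g = np.zeros(m)
    resid = y - mu                      # residual given current effects
    xtx = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(m):
            if xtx[j] == 0.0:
                continue                # monomorphic column, no information
            resid += X[:, j] * g[j]     # take SNP j out of the fit
            y_hat = X[:, j] @ resid / xtx[j]   # least-squares estimate
            new = posterior_mean_effect(y_hat, var_e / xtx[j], gamma, var_g)
            resid -= X[:, j] * new      # put the updated SNP j back in
            max_change = max(max_change, abs(new - g[j]))
            g[j] = new
        if max_change < tol:
            break
    return mu, g
```

Given a centered genotype matrix X and a phenotype vector y, ice_fit(X, y, gamma, var_g, var_e) returns an intercept and a vector of shrunken SNP effects, and predictions for new animals would be mu + X_new @ g. Because every update is a closed-form expectation rather than a sampled value, no MCMC chain is needed, which is the general reason such schemes can be much faster than sampling-based BayesC.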

Availability of data

Raw DNA sequencing reads were deposited in NCBI with the project accession of PRJNA309464 and SRA accession of SRR3114179. The SNP data were deposited in European Variation Archive (EVA) with the project accession of PRJEB22261.

Acknowledgements

Shijun Xiao and Zhaofang Han performed the SNP calling. Kun Ye, Qingkai Chen, Junwei Chen, Yang Liu and other colleagues in the laboratory participated in DNA extraction and trait measurement. The work was supported by the China Agriculture Research System (CARS-47-G04), Key projects of the Xiamen Southern Ocean Research Centre (14GZY70NF34) and the National Natural Science Foundation of China (U1205122).

Funding

The work was supported by the China Agriculture Research System (CARS-47-G04), Key projects of the Xiamen Southern Ocean Research Centre (14GZY70NF34) and the National Natural Science Foundation of China (U1205122).

Author information

Corresponding author

Correspondence to Zhiyong Wang.

Ethics declarations

Conflict of interest

Authors LD and ZW declare that they have no conflict of interest.

About this article

Cite this article

Dong, L., Wang, Z. Genomic prediction using an iterative conditional expectation algorithm for a fast BayesC-like model. Genetica 146, 361–368 (2018). https://doi.org/10.1007/s10709-018-0027-x

  • DOI: https://doi.org/10.1007/s10709-018-0027-x
