Genome-wide barebones regression scan for mixed-model association analysis

  • Original Article
Theoretical and Applied Genetics

Abstract

Key message

Based on a simplified FaST-LMM in which genomic variance is replaced with heritability, we substantially improve computational efficiency by using the fast R function fastLmPure to statistically infer the genetic effects of tested SNPs and by focusing on the SNPs with large effects or high significance identified by the EMMAX algorithm.
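The substitution of heritability for genomic variance can be written out as a standard reparameterisation of the mixed model. This is a sketch under assumptions: the symbols used here (the relationship matrix K, its eigenvectors U and eigenvalues D, the total variance σ² and the heritability h²) are illustrative choices and are not taken from the article.

```latex
\begin{aligned}
\mathbf{y} &= \mathbf{X}\boldsymbol{\beta} + \mathbf{g} + \mathbf{e}, \qquad
\mathbf{g} \sim N(\mathbf{0}, \sigma_g^2 \mathbf{K}), \quad
\mathbf{e} \sim N(\mathbf{0}, \sigma_e^2 \mathbf{I}), \\
\operatorname{Var}(\mathbf{y}) &= \sigma_g^2 \mathbf{K} + \sigma_e^2 \mathbf{I}
 = \sigma^2 \left[ h^2 \mathbf{K} + (1 - h^2)\, \mathbf{I} \right], \qquad
\sigma^2 = \sigma_g^2 + \sigma_e^2, \quad h^2 = \sigma_g^2 / \sigma^2, \\
\mathbf{K} &= \mathbf{U}\mathbf{D}\mathbf{U}^{\top} \;\Rightarrow\;
\mathbf{U}^{\top}\mathbf{y} \sim N\!\left( \mathbf{U}^{\top}\mathbf{X}\boldsymbol{\beta},\;
\sigma^2 \left[ h^2 \mathbf{D} + (1 - h^2)\, \mathbf{I} \right] \right).
\end{aligned}
```

Because the covariance of the rotated data is diagonal, dividing the i-th rotated row by the square root of h² dᵢ + (1 − h²) leaves an ordinary least-squares problem for any fixed h², which is the kind of problem a barebones linear model fitter can solve quickly.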

Abstract

For genome-wide mixed-model association analysis, we introduce fastLmPure, a barebones linear model fitting function from the R/RcppArmadillo package, for the rapid estimation of single nucleotide polymorphism (SNP) effects and of the maximum likelihood values of factored spectrally transformed linear mixed models (FaST-LMM). Starting from the genomic heritability of a quantitative trait estimated under a null model without quantitative trait nucleotides, maximum likelihood estimation of the polygenic heritabilities for all candidate markers takes about as long as four rounds of genome-wide regression scans. When attention is restricted to SNPs with large effects or high significance levels, as estimated by the efficient mixed-model association expedited (EMMAX) algorithm, the run time of genome-wide mixed-model association analysis is reduced to at most two rounds of genome-wide regression scans. We have developed a novel software application called Single-RunKing that transforms nonlinear mixed-model association analyses into barebones linear regression scans. Based on a realised relationship matrix calculated from genome-wide markers, Single-RunKing saves substantial computation time compared with a FaST-LMM implementation that optimises the ratio of polygenic variance to residual variance using the R/lm function.
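A minimal sketch of one round of such a regression scan is given below, in Python rather than R. It is not the Single-RunKing code: the function name `scaled_regression_scan` and all argument names are hypothetical, the inputs are assumed to have already been rotated by the eigenvectors of the relationship matrix, and the heritability `h2` is assumed to be fixed at its null-model estimate (an EMMAX-style scan). Each rotated observation is scaled by the inverse square root of `h2 * d_i + (1 - h2)`, after which every SNP is fitted by plain least squares, the step that fastLmPure would perform in R.

```python
import math


def scaled_regression_scan(y_rot, intercept_rot, snps_rot, eigvals, h2):
    """Scan SNPs by least squares on spectrally transformed, rescaled data.

    y_rot, intercept_rot and each entry of snps_rot are assumed to be the
    phenotype, the column of ones and a SNP genotype vector after
    multiplication by U^T, where K = U diag(eigvals) U^T (hypothetical inputs).
    Returns one (effect, standard_error, t_statistic) tuple per SNP.
    """
    n = len(y_rot)
    if n <= 2:
        raise ValueError("need more than two observations")
    # Variance of rotated observation i, up to a common factor:
    # h2 * d_i + (1 - h2). Scaling by its inverse square root whitens the data.
    w = [1.0 / math.sqrt(h2 * d + (1.0 - h2)) for d in eigvals]
    y = [wi * v for wi, v in zip(w, y_rot)]
    c = [wi * v for wi, v in zip(w, intercept_rot)]
    cc = sum(a * a for a in c)
    cy = sum(a * v for a, v in zip(c, y))
    results = []
    for snp in snps_rot:
        x = [wi * v for wi, v in zip(w, snp)]
        # Normal equations for the two-column design [c, x].
        cx = sum(a * b for a, b in zip(c, x))
        xx = sum(b * b for b in x)
        xy = sum(b * v for b, v in zip(x, y))
        det = cc * xx - cx * cx
        if det <= 1e-12 * cc * xx:
            # SNP is (numerically) collinear with the intercept column.
            results.append((float("nan"), float("nan"), float("nan")))
            continue
        mu = (xx * cy - cx * xy) / det
        beta = (cc * xy - cx * cy) / det
        rss = sum((v - mu * a - beta * b) ** 2 for a, b, v in zip(c, x, y))
        se = math.sqrt(rss / (n - 2) * cc / det)
        t = beta / se if se > 0 else float("nan")
        results.append((beta, se, t))
    return results
```

The per-SNP work is a handful of dot products, so the cost of a scan grows linearly with the number of markers once the eigendecomposition is available. Re-estimating the heritability for each candidate marker would repeat this fit at several values of `h2`, which is consistent with the abstract's accounting of run time in units of regression scans.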



Acknowledgements

This work was supported by the Special Scientific Research Funds for Central Non-profit Institutes, Chinese Academy of Fishery Sciences (2017A001), and China Marine Fish Industry Technology System (CARS-47-G02).

Author information

Contributions

RQY conceived the proposed method and designed the associated computer software. JG and ZYH developed the source code. XFZ and LJ collected real datasets and participated in simulations and case analyses. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Runqing Yang.

Additional information

Communicated by Mikko J. Sillanpaa.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gao, J., Zhou, X., Hao, Z. et al. Genome-wide barebones regression scan for mixed-model association analysis. Theor Appl Genet 133, 51–58 (2020). https://doi.org/10.1007/s00122-019-03439-5

  • DOI: https://doi.org/10.1007/s00122-019-03439-5
