Genome-wide barebones regression scan for mixed-model association analysis

Gao, Jin; Zhou, Xuefei; Hao, Zhiyu; Jiang, Li; Yang, Runqing

doi:10.1007/s00122-019-03439-5

Original Article
Published: 24 September 2019

Volume 133, pages 51–58, (2020)

Theoretical and Applied Genetics

Runqing Yang ORCID: orcid.org/0000-0002-7236-5178

Abstract

Key message

Building on a simplified FaST-LMM in which genomic variance is replaced with heritability, we substantially improve computational efficiency by using the fast R function fastLmPure to infer the genetic effects of tested SNPs and by focusing on the SNPs that the EMMAX algorithm identifies as having large effects or high significance.
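
To make "genomic variance is replaced with heritability" concrete, the following is a sketch under a standard parameterisation that is assumed here rather than quoted from the article. The symbols K (relationship matrix), U and S (its eigenvectors and eigenvalues), σ_g², σ_e², σ_p² and h² are introduced for illustration only.

```latex
\begin{aligned}
\operatorname{Var}(\mathbf{y})
  &= \sigma_g^2 \mathbf{K} + \sigma_e^2 \mathbf{I}
   = \sigma_p^2 \left[ h^2 \mathbf{K} + (1 - h^2)\,\mathbf{I} \right],
  \qquad
  \sigma_p^2 = \sigma_g^2 + \sigma_e^2,
  \quad
  h^2 = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}, \\
\mathbf{K} = \mathbf{U}\mathbf{S}\mathbf{U}^{\top}
  &\;\Rightarrow\;
  \operatorname{Var}(\mathbf{U}^{\top}\mathbf{y})
   = \sigma_p^2 \left[ h^2 \mathbf{S} + (1 - h^2)\,\mathbf{I} \right].
\end{aligned}
```

Under this assumption the rotated phenotype has a diagonal covariance, so for any fixed h² the fit reduces to a weighted linear regression, and the only nonlinear parameter is h², which is bounded between 0 and 1.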

Abstract

For genome-wide mixed-model association analysis, we introduce fastLmPure, a barebones linear model fitting function from the R package RcppArmadillo, for the rapid estimation of single nucleotide polymorphism (SNP) effects and the maximum likelihood values of factored spectrally transformed linear mixed models (FaST-LMM). Starting from the genomic heritability of a quantitative trait estimated under a null model without quantitative trait nucleotides, maximum likelihood estimation of the polygenic heritabilities for candidate markers takes about as long as four rounds of genome-wide regression scans. When attention is restricted to SNPs with large effects or high significance levels, as estimated by the efficient mixed-model association expedited (EMMAX) algorithm, the run time of the genome-wide mixed-model association analysis falls to at most two rounds of genome-wide regression scans. We have developed a novel software application, Single-RunKing, that transforms nonlinear mixed-model association analyses into barebones linear regression scans. Using a realised relationship matrix calculated from genome-wide markers, Single-RunKing substantially reduces computation time compared with a FaST-LMM that optimises the ratio of polygenic variance to residual variance using the R/lm function.
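
The idea of one regression scan at a fixed heritability can be sketched in a few lines. This is a minimal illustration in plain Python, not the Single-RunKing code and not a call to fastLmPure. It assumes the phenotype, the intercept column and each SNP have already been rotated by the eigenvectors of the relationship matrix, and that the rotated covariance is proportional to h2 * eigenvalue + (1 - h2). The function names `fit_one_snp` and `scan` and all toy numbers are hypothetical.

```python
import math


def fit_one_snp(y_rot, one_rot, snp_rot, eigvals, h2):
    """Weighted least squares for intercept + one SNP in the rotated space.

    Assumes Var(y_rot) is proportional to diag(h2 * s + (1 - h2)), which is
    an assumed parameterisation, not one quoted from the article.
    """
    d = [h2 * s + (1.0 - h2) for s in eigvals]  # diagonal covariance terms
    w = [1.0 / di for di in d]                  # regression weights
    # Weighted normal equations for the two-column design [one_rot, snp_rot].
    a11 = sum(wi * a * a for wi, a in zip(w, one_rot))
    a12 = sum(wi * a * x for wi, a, x in zip(w, one_rot, snp_rot))
    a22 = sum(wi * x * x for wi, x in zip(w, snp_rot))
    b1 = sum(wi * a * yi for wi, a, yi in zip(w, one_rot, y_rot))
    b2 = sum(wi * x * yi for wi, x, yi in zip(w, snp_rot, y_rot))
    det = a11 * a22 - a12 * a12
    mu = (a22 * b1 - a12 * b2) / det            # intercept estimate
    beta = (a11 * b2 - a12 * b1) / det          # SNP effect estimate
    rss = sum(
        wi * (yi - mu * a - beta * x) ** 2
        for wi, yi, a, x in zip(w, y_rot, one_rot, snp_rot)
    )
    n = len(y_rot)
    sigma2 = rss / n                            # ML estimate (divides by n)
    se = math.sqrt(sigma2 * a11 / det)          # standard error of beta
    # Profile log-likelihood at this h2, maximised over mu, beta and sigma2.
    loglik = -0.5 * (
        n * math.log(2.0 * math.pi * sigma2) + n + sum(math.log(di) for di in d)
    )
    return beta, se, loglik


def scan(y_rot, one_rot, snps_rot, eigvals, h2):
    """One regression scan: fit every SNP at the same fixed heritability."""
    return [fit_one_snp(y_rot, one_rot, x, eigvals, h2) for x in snps_rot]


if __name__ == "__main__":
    # Toy, made-up inputs standing in for already-rotated data.
    eigvals = [0.5, 1.0, 1.5, 2.0]
    y_rot = [1.0, 3.0, 5.0, 8.0]
    one_rot = [1.0, 1.0, 1.0, 1.0]
    snps_rot = [[0.0, 1.0, 2.0, 3.0], [1.0, 0.0, 1.0, 0.0]]
    for beta, se, loglik in scan(y_rot, one_rot, snps_rot, eigvals, h2=0.4):
        print(beta, se, loglik)
```

Repeating `scan` at a handful of h2 values and keeping the value with the highest log-likelihood for each SNP is one simple way to see why the cost of such an analysis can be counted in rounds of regression scans.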

Acknowledgements

This work was supported by the Special Scientific Research Funds for Central Non-profit Institutes, Chinese Academy of Fishery Sciences (2017A001), and China Marine Fish Industry Technology System (CARS-47-G02).

Author information

Jin Gao and Xuefei Zhou have contributed equally to this work.

Authors and Affiliations

Wuxi Fisheries College, Nanjing Agricultural University, Wuxi, 214081, China
Jin Gao & Runqing Yang
Zhongbo International Business School, Guangzhou Zhongbo Education Corporation Limited, Guangzhou, 511458, China
Xuefei Zhou
College of Animal Science and Technology, Northeast Agricultural University, Harbin, 150030, China
Zhiyu Hao
Research Centre for Aquatic Biotechnology, Chinese Academy of Fishery Sciences, Beijing, 100141, China
Li Jiang & Runqing Yang

Contributions

RQY conceived the proposed method and designed the associated computer software. JG and ZYH developed the source code. XFZ and LJ collected real datasets and participated in simulations and case analyses. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Runqing Yang.

Additional information

Communicated by Mikko J. Sillanpaa.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gao, J., Zhou, X., Hao, Z. et al. Genome-wide barebones regression scan for mixed-model association analysis. Theor Appl Genet 133, 51–58 (2020). https://doi.org/10.1007/s00122-019-03439-5

Received: 15 January 2019
Accepted: 17 September 2019
Published: 24 September 2019
Issue Date: January 2020
DOI: https://doi.org/10.1007/s00122-019-03439-5
