A gene-based test of association through an orthogonal decomposition of genotype scores

Chen, Zhongxue; Wang, Kai

doi:10.1007/s00439-017-1839-y

Original Investigation
Published: 01 September 2017

Volume 136, pages 1385–1394, (2017)
Human Genetics

Zhongxue Chen¹ &
Kai Wang²

Abstract

The burden test and the sequence kernel association test (SKAT) are two popular methods for detecting association with rare variants. Treated as two different sources of association information, they are adaptively combined to form an optimal SKAT (SKAT-O) method for optimal power. We show that the burden test is part of, rather than independent of, the SKAT. We introduce a new test statistic that is the sum of the burden statistic and a statistic asymptotically independent of the burden statistic. The performance of this new test statistic is demonstrated through extensive simulation studies and applications to a Genetic Analysis Workshop 17 data set and the Ocular Hypertension Treatment Study data.

Acknowledgements

The authors would like to thank the editor and three anonymous referees for their insightful comments which resulted in a substantial improvement of the paper.

Author information

Authors and Affiliations

Department of Epidemiology and Biostatistics, School of Public Health, Indiana University Bloomington, 1025 E. 7th Street, Bloomington, IN, 47405, USA
Zhongxue Chen
Department of Biostatistics, N322 CPHB College of Public Health, University of Iowa, 145 N. Riverside Drive, Iowa City, IA, 52242, USA
Kai Wang

Corresponding author

Correspondence to Kai Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (TXT 1 kb)

Appendix: Proofs

Proof of Proposition 2.1

Under the model assumption, asymptotically, $ \tilde{y}\sim N(0,I) $, where $ I $ is the identity matrix. The covariance between $ Q_{\text{B}} $ and $ Q_{\text{s}} $ is

$$
\begin{aligned}
\text{Cov}\left( Q_{\text{B}}, Q_{\text{s}} \right)
&= 2\,\text{trace}\left[ G 1 1^{T} G^{T} G G^{T} \right]
= 2\,\text{trace}\left[ G 1 1^{T} G^{T} \sum_{j=1}^{m} \lambda_{j} v_{j} v_{j}^{T} \right] \\
&= 2\sum_{j=1}^{m} \lambda_{j}\,\text{trace}\left[ G 1 1^{T} G^{T} v_{j} v_{j}^{T} \right]
= 2\sum_{j=1}^{m} \lambda_{j}\,\text{trace}\left[ 1^{T} G^{T} v_{j} v_{j}^{T} G 1 \right]
= 2\sum_{j=1}^{m} \lambda_{j} \left( v_{j}^{T} G 1 \right)^{2} > 0.
\end{aligned}
$$

Therefore, $ Q_{\text{B}} $ and $ Q_{\text{s}} $ are correlated in general.
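As a rough numerical illustration of the trace formula, the sketch below evaluates $ 2\,{\text{trace}}[G11^{T}G^{T}GG^{T}] $ for a small genotype matrix. The matrix `G` and the helper functions are made up for illustration and are not taken from the paper. The sketch also evaluates the algebraically equivalent form $ 2\,\lVert G^{T}G1\rVert^{2} $, which follows from the cyclic property of the trace and shows why the covariance is positive whenever $ G^{T}G1 \ne 0 $.

```python
# Toy check of Cov(Q_B, Q_s) = 2 trace[G 1 1^T G^T G G^T].
# G is a hypothetical 4-subject x 3-variant matrix of minor-allele counts.

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

G = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 2],
     [1, 1, 0]]
ones = [[1], [1], [1]]            # the vector 1, one entry per variant

Gt = transpose(G)
g1 = matmul(G, ones)              # G 1: per-subject sum of genotype scores
A = matmul(g1, transpose(g1))     # G 1 1^T G^T, the matrix inside Q_B
B = matmul(G, Gt)                 # G G^T, the matrix inside Q_s

cov_trace = 2 * trace(matmul(A, B))

# Equivalent form via the cyclic property of the trace: 2 * ||G^T G 1||^2
w = matmul(Gt, g1)
cov_norm = 2 * sum(row[0] ** 2 for row in w)

print(cov_trace, cov_norm)        # both equal 68 for this toy G
```

Integer genotype counts keep the arithmetic exact, so the two expressions agree without any floating-point tolerance.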

Proof of Theorem 2.1

By definition, $ \tilde{G} = G - \tilde{v}_{0} \tilde{v}_{0}^{T} G $ with $ \tilde{v}_{0} = \frac{G1}{\sqrt{1^{T} G^{T} G1}} $, so $ \tilde{v}_{0}^{T} \tilde{v}_{0} = 1 $ and $ \tilde{G}\tilde{G}^{T} \tilde{v}_{0} = 0 $. Therefore, $ \tilde{v}_{0} $ is orthogonal to the space spanned by the column vectors of $ \tilde{G}\tilde{G}^{T} $. Hence, for each eigenvector $ \tilde{v}_{j} $ of $ \tilde{G}\tilde{G}^{T} $ associated with a positive eigenvalue, we have $ \tilde{v}_{j}^{T} \tilde{v}_{0} = 0 $. Under the null hypothesis, $ \tilde{y} $ asymptotically has mean zero and identity covariance, so each $ \tilde{v}_{j}^{T} \tilde{y} $ has an asymptotic standard normal distribution: $ E[\tilde{v}_{j}^{T} \tilde{y}] = 0 \;{\text{and}}\; {\text{Var}}[\tilde{v}_{j}^{T} \tilde{y}] = \tilde{v}_{j}^{T} \tilde{v}_{j} = 1 $. The covariance between $ \tilde{v}_{j}^{T} \tilde{y} $ and $ \tilde{v}_{k}^{T} \tilde{y} $ is $ {\text{Cov}} ( {\tilde{v}_{j}^{T} \tilde{y},\tilde{v}_{k}^{T} \tilde{y}} ) = \tilde{v}_{j}^{T} \tilde{v}_{k} = 0\; {\text{if}}\; j \ne k, \;{\text{and}}\; j,k = 0, 2, \ldots ,m $.
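The orthogonality step can be written out explicitly. In the sketch below, $ \tilde{\lambda}_{j} $ denotes the positive eigenvalue of $ \tilde{G}\tilde{G}^{T} $ that belongs to the eigenvector $ \tilde{v}_{j} $; this symbol is introduced here for illustration and may differ from the notation in the main text.

$$
\begin{aligned}
\tilde{G}^{T}\tilde{v}_{0} &= G^{T}\tilde{v}_{0} - G^{T}\tilde{v}_{0}\left( \tilde{v}_{0}^{T}\tilde{v}_{0} \right) = 0 \quad \text{because } \tilde{v}_{0}^{T}\tilde{v}_{0} = 1, \\
\tilde{v}_{j}^{T}\tilde{v}_{0} &= \frac{1}{\tilde{\lambda}_{j}}\left( \tilde{G}\tilde{G}^{T}\tilde{v}_{j} \right)^{T}\tilde{v}_{0} = \frac{1}{\tilde{\lambda}_{j}}\,\tilde{v}_{j}^{T}\tilde{G}\left( \tilde{G}^{T}\tilde{v}_{0} \right) = 0.
\end{aligned}
$$

The first line gives $ \tilde{G}\tilde{G}^{T}\tilde{v}_{0} = 0 $. The second line uses the symmetry of $ \tilde{G}\tilde{G}^{T} $ together with $ \tilde{G}\tilde{G}^{T}\tilde{v}_{j} = \tilde{\lambda}_{j}\tilde{v}_{j} $.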

Proof of Theorem 2.2

Under the null hypothesis, the p values $ P_{1} $ and $ P_{2} $ from $ Q_{1} $ and $ Q_{2} $, respectively, are asymptotically uniformly distributed between 0 and 1. Using the quantile transformation described in the text, the variables $ \left( {\chi_{1}^{2} } \right)^{ - 1} \left( {P_{i} } \right) \; (i = 1,2) $ asymptotically and independently follow a chi-square distribution with 1 degree of freedom. Therefore, the new test statistic $ Q_{\text{new}} = \sum_{i = 1}^{2} \left( {\chi_{1}^{2} } \right)^{ - 1} (P_{i} ) $ has an asymptotic chi-square distribution with 2 degrees of freedom.
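A minimal sketch of this combination step, using only the Python standard library, is given below. The function names are hypothetical. The sketch also assumes that the quantile transformation is taken in the upper tail, so that a small p value maps to a large statistic; the main text fixes the actual convention, which is not reproduced in this appendix. Either convention yields a chi-square variable with 1 degree of freedom under the null hypothesis.

```python
from math import exp
from statistics import NormalDist

def chi2_1_upper_quantile(p):
    """Return x such that P(X > x) = p for X ~ chi-square with 1 df.

    Uses the fact that X = Z^2 with Z standard normal, so
    P(X > x) = 2 * P(Z < -sqrt(x)).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    z = NormalDist().inv_cdf(p / 2.0)   # lower-tail normal quantile (<= 0)
    return z * z

def combine_two_p_values(p1, p2):
    """Combine two independent p values into Q_new and its p value.

    Assumption: upper-tail quantile transformation (see lead-in).
    """
    q_new = chi2_1_upper_quantile(p1) + chi2_1_upper_quantile(p2)
    # Survival function of a chi-square with 2 df is exp(-x / 2).
    p_new = exp(-q_new / 2.0)
    return q_new, p_new

if __name__ == "__main__":
    # Hypothetical p values for Q_1 and Q_2, for illustration only.
    q_new, p_new = combine_two_p_values(0.03, 0.20)
    print(f"Q_new = {q_new:.4f}, combined p value = {p_new:.4f}")
```

The closed-form survival function $ \exp(-x/2) $ is specific to 2 degrees of freedom, which is why no numerical chi-square routine is needed here.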

About this article

Cite this article

Chen, Z., Wang, K. A gene-based test of association through an orthogonal decomposition of genotype scores. Hum Genet 136, 1385–1394 (2017). https://doi.org/10.1007/s00439-017-1839-y

Received: 30 June 2017
Accepted: 26 August 2017
Published: 01 September 2017
Issue Date: October 2017
DOI: https://doi.org/10.1007/s00439-017-1839-y