A gene-based test of association through an orthogonal decomposition of genotype scores

  • Original Investigation
Human Genetics

Abstract

The burden test and the sequence kernel association test (SKAT) are two popular methods for detecting association with rare variants. Treated as two different sources of association information, they are adaptively combined to form the optimal SKAT (SKAT-O) method. We show that the burden test is part of, rather than independent of, the SKAT. We introduce a new test statistic that is the sum of the burden statistic and a statistic asymptotically independent of the burden statistic. The performance of this new test statistic is demonstrated through extensive simulation studies and applications to a Genetic Analysis Workshop 17 data set and the Ocular Hypertension Treatment Study data.

Acknowledgements

The authors would like to thank the editor and three anonymous referees for their insightful comments which resulted in a substantial improvement of the paper.

Author information

Corresponding author

Correspondence to Kai Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (TXT 1 kb)

Appendix: Proofs

Proof of Proposition 2.1

Under the model assumption, asymptotically, \( \tilde{y}\sim N(0,1) \). The covariance between \( Q_{\text{B}} \) and \( Q_{\text{s}} \) is

$$ \begin{aligned}
\text{Cov}\left( Q_{\text{B}}, Q_{\text{s}} \right) &= 2\,\text{trace}\left[ G 1 1^{T} G^{T} G G^{T} \right] = 2\,\text{trace}\left[ G 1 1^{T} G^{T} \sum_{j = 1}^{m} \lambda_{j} v_{j} v_{j}^{T} \right] \\
&= 2\sum_{j = 1}^{m} \lambda_{j}\,\text{trace}\left[ G 1 1^{T} G^{T} v_{j} v_{j}^{T} \right] = 2\sum_{j = 1}^{m} \lambda_{j}\,\text{trace}\left[ 1^{T} G^{T} v_{j} v_{j}^{T} G 1 \right] = 2\sum_{j = 1}^{m} \lambda_{j} \left( v_{j}^{T} G 1 \right)^{2} > 0.
\end{aligned} $$

Therefore, \( Q_{\text{B}} \) and \( Q_{\text{s}} \) are correlated in general.
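The trace expression in this proof can be checked on a small example. The sketch below is only an illustration: the toy genotype matrix `G` and the helper names `matmul`, `transpose` and `trace` are assumptions made here, not part of the paper. It computes \( 2\,{\text{trace}}[G11^{T} G^{T} GG^{T}] \) by explicit matrix products and compares it with the equivalent shortcut \( 2\,\| G^{T} G1 \|^{2} \), which follows from the cyclic property of the trace.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

def trace(X):
    """Sum of the diagonal entries of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))

# Toy genotype matrix (assumption): 5 subjects (rows) by 3 variants (columns).
G = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 2],
     [1, 1, 0],
     [0, 0, 0]]

g1 = [[sum(row)] for row in G]        # G 1 as an n x 1 column vector
A = matmul(g1, transpose(g1))         # G 1 1^T G^T, the burden kernel
B = matmul(G, transpose(G))           # G G^T, the linear SKAT kernel

# Covariance from the trace formula in the proof.
cov_trace = 2 * trace(matmul(A, B))

# Same quantity via trace[A B] = (G1)^T G G^T (G1) = ||G^T G 1||^2.
gtg1 = matmul(transpose(G), g1)       # G^T G 1 as an m x 1 column vector
cov_short = 2 * sum(x[0] ** 2 for x in gtg1)

print(cov_trace, cov_short)
```

Both values agree and are strictly positive whenever \( G1 \ne 0 \), which is the sense in which the two statistics are correlated.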

Proof of Theorem 2.1

Since \( \tilde{G} = G - \tilde{v}_{0} \tilde{v}_{0}^{T} G \) with \( \tilde{v}_{0} = \frac{G1}{{\sqrt {1^{T} G^{T} G1} }} \) and \( \tilde{v}_{0}^{T} \tilde{v}_{0} = 1 \), we have \( \tilde{G}^{T} \tilde{v}_{0} = G^{T} \tilde{v}_{0} - G^{T} \tilde{v}_{0} \tilde{v}_{0}^{T} \tilde{v}_{0} = 0 \), and hence \( \tilde{G}\tilde{G}^{T} \tilde{v}_{0} = 0 \). Therefore, \( \tilde{v}_{0} \) is orthogonal to the space spanned by the column vectors of \( \tilde{G}\tilde{G}^{T} \). Hence, for each eigenvector \( \tilde{v}_{j} \) of \( \tilde{G}\tilde{G}^{T} \) with a positive eigenvalue, we have \( \tilde{v}_{j}^{T} \tilde{v}_{0} = 0 \). Under the null hypothesis, each \( \tilde{v}_{j}^{T} \tilde{y} \) has an asymptotic standard normal distribution, with \( E[\tilde{v}_{j}^{T} \tilde{y}] = 0 \) and \( {\text{Var}}[\tilde{v}_{j}^{T} \tilde{y}] = 1 \). The covariance between \( \tilde{v}_{j}^{T} \tilde{y} \) and \( \tilde{v}_{k}^{T} \tilde{y} \) is \( {\text{Cov}} ( {\tilde{v}_{j}^{T} \tilde{y},\tilde{v}_{k}^{T} \tilde{y}} ) = \tilde{v}_{j}^{T} \tilde{v}_{k} = 0 \) if \( j \ne k \), for \( j,k = 0, 2, \ldots ,m \).
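The projection used in this proof can be illustrated numerically. The following is a minimal sketch under stated assumptions: the toy matrix `G` and the function name `project_out_burden` are hypothetical and introduced here only for illustration. It builds \( \tilde{v}_{0} \) and \( \tilde{G} \) as defined above and checks that \( \tilde{G}^{T} \tilde{v}_{0} = 0 \), which is the property the orthogonality argument rests on.

```python
import math

def project_out_burden(G):
    """Return (v0, G_tilde) for an n x m matrix G stored as a list of rows.

    v0      = G 1 / sqrt(1^T G^T G 1)  (unit vector along the burden score)
    G_tilde = G - v0 v0^T G            (G with the v0 direction removed)
    """
    n, m = len(G), len(G[0])
    g1 = [sum(row) for row in G]                    # G 1
    norm = math.sqrt(sum(x * x for x in g1))        # sqrt(1^T G^T G 1)
    if norm == 0.0:
        raise ValueError("G 1 is the zero vector; v0 is undefined")
    v0 = [x / norm for x in g1]
    # v0^T G is a length-m row vector.
    v0_G = [sum(v0[i] * G[i][j] for i in range(n)) for j in range(m)]
    G_tilde = [[G[i][j] - v0[i] * v0_G[j] for j in range(m)]
               for i in range(n)]
    return v0, G_tilde

# Toy genotype matrix (assumption): 5 subjects by 3 variants.
G = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 2],
     [1, 1, 0],
     [0, 0, 0]]

v0, G_tilde = project_out_burden(G)
n, m = len(G), len(G[0])

# G_tilde^T v0: one entry per variant, all zero up to rounding.
Gt_T_v0 = [sum(G_tilde[i][j] * v0[i] for i in range(n)) for j in range(m)]

# G_tilde 1: the row sums of G_tilde also vanish, since v0 v0^T G 1 = G 1.
Gt_1 = [sum(row) for row in G_tilde]

print(Gt_T_v0)
print(Gt_1)
```

Because the row sums of \( \tilde{G} \) vanish, the burden score carries no information in \( \tilde{G} \); all of it sits in the \( \tilde{v}_{0} \) direction.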

Proof of Theorem 2.2

Under the null hypothesis, asymptotically, the p values \( P_{1} \) and \( P_{2} \) from \( Q_{1} \) and \( Q_{2} \), respectively, are each uniformly distributed between 0 and 1. Using the quantile transformation described in the text, the variables \( \left( {\chi_{1}^{2} } \right)^{ - 1} \left( {P_{i} } \right) (i = 1,2) \) asymptotically and independently follow a Chi-square distribution with 1 degree of freedom. Therefore, the new test statistic \( Q_{\text{new}} = \mathop \sum \nolimits_{i = 1}^{2} \left( {\chi_{1}^{2} } \right)^{ - 1} (P_{i} ) \) has an asymptotic Chi-square distribution with 2 degrees of freedom.
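The combination step can be sketched with the Python standard library alone. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, and it assumes the quantile transformation maps each p value to the upper-tail quantile of the Chi-square distribution with 1 degree of freedom, so that small p values give large statistics. It uses two standard facts: that quantile equals \( [\Phi^{-1}(P/2)]^{2} \), and the upper-tail probability of a Chi-square variable with 2 degrees of freedom at \( x \) is \( e^{-x/2} \).

```python
import math
from statistics import NormalDist

def chi2_1_upper_quantile(p):
    """Return x such that P(chi-square with 1 df > x) = p, for 0 < p <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    # P(Z^2 > x) = 2 * Phi(-sqrt(x)) = p, so -sqrt(x) = Phi^{-1}(p / 2).
    z = NormalDist().inv_cdf(p / 2.0)
    return z * z

def combine_two_p_values(p1, p2):
    """Return (Q_new, p value of Q_new) for two independent p values.

    Assumption: upper-tail quantiles are used for the transformation.
    """
    q_new = chi2_1_upper_quantile(p1) + chi2_1_upper_quantile(p2)
    # Survival function of the chi-square distribution with 2 df.
    p_new = math.exp(-q_new / 2.0)
    return q_new, p_new

# Example with made-up p values for the two component statistics.
q_new, p_new = combine_two_p_values(0.01, 0.50)
print(q_new, p_new)
```

Passing `p / 2` to the normal quantile, rather than `1 - p / 2`, gives the same squared value by symmetry and avoids rounding `1 - p / 2` to exactly 1 for very small p values.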

Cite this article

Chen, Z., Wang, K. A gene-based test of association through an orthogonal decomposition of genotype scores. Hum Genet 136, 1385–1394 (2017). https://doi.org/10.1007/s00439-017-1839-y

  • DOI: https://doi.org/10.1007/s00439-017-1839-y
