Abstract
This study explored a semi-parametric method built upon reproducing kernels for estimating and testing the joint effect of a set of single nucleotide polymorphisms (SNPs). The kernel adopted is the identity-by-state (IBS) kernel, which measures SNP similarity between subjects. We first assessed the statistical power of the method through simulations under different scenarios. In addition to sample size, the testing power was affected by the strength of association between the SNPs and the outcome of interest and by the SNP similarity among subjects. A quadratic relationship between SNP similarity and testing power was identified, and this relationship was further affected by sample size. Next, we applied the method to a SNP and lung function data set to estimate and test the joint effect of a set of SNPs on forced vital capacity, one measure of lung function. The findings were then connected to the patterns observed in the simulation studies and further explored via variable importance indices for each SNP, inferred from a variable selection procedure.
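To make the notion of SNP similarity between subjects concrete, the following is a minimal sketch of an identity-by-state (IBS) kernel. It assumes each SNP is coded as a minor-allele count (0, 1, or 2) and uses the commonly cited form in which the similarity of two subjects is the number of alleles they share across all SNPs, divided by twice the number of SNPs. The function name `ibs_kernel` and the toy genotypes are illustrative assumptions; the abstract does not specify the coding or weighting used in the article.

```python
from typing import List


def ibs_kernel(genotypes: List[List[int]]) -> List[List[float]]:
    """Return the n x n IBS similarity matrix for n subjects.

    Assumption: genotypes[i][s] is the minor-allele count (0, 1, or 2)
    of subject i at SNP s, and every subject has the same number of SNPs.
    """
    n = len(genotypes)
    p = len(genotypes[0])
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Two subjects share 2 - |a - b| alleles at a SNP under 0/1/2 coding.
            shared = sum(2 - abs(a - b) for a, b in zip(genotypes[i], genotypes[j]))
            # Scale by the maximum possible count (2 alleles per SNP) so values lie in [0, 1].
            kernel[i][j] = kernel[j][i] = shared / (2 * p)
    return kernel


# Hypothetical toy data: three subjects, three SNPs.
G = [
    [0, 1, 2],
    [0, 1, 2],  # identical to subject 0
    [2, 1, 0],  # opposite homozygotes at SNPs 0 and 2
]
K = ibs_kernel(G)
```

In this toy example, subjects 0 and 1 have identical genotypes and therefore a similarity of 1, while subjects 0 and 2 share alleles only at the middle SNP and have a similarity of 1/3. A matrix of this kind is what a kernel-based test would take as its measure of similarity among subjects.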
Acknowledgments
This work was supported by the National Institute of Allergy and Infectious Diseases (H. He, H. Zhang, and W. Karmaus, Grant number R01 AI091905) and by the National Institute of Environmental Health Sciences (A. Maity, Grant number R00 ES017744).
Cite this article
He, H., Zhang, H., Maity, A. et al. Power of a reproducing kernel-based method for testing the joint effect of a set of single-nucleotide polymorphisms. Genetica 140, 421–427 (2012). https://doi.org/10.1007/s10709-012-9690-5
DOI: https://doi.org/10.1007/s10709-012-9690-5