Abstract
We propose a Bayesian hierarchical mixture model framework that allows us to investigate genetic and environmental effects, gene-by-gene interactions, and gene-by-environment interactions in the same model. Our approach incorporates the natural hierarchical structure between main effects and interaction effects into a mixture model, so that our methods tend to remove irrelevant interaction effects more effectively, resulting in more robust and parsimonious models. We consider both strong and weak hierarchical models. For a strong hierarchical model, the main effects of both interacting factors must be present for the interaction to be considered in model development, while for a weak hierarchical model, only one of the two main effects is required to be present for the interaction to be evaluated. Our simulation results show that, in most of the scenarios simulated, the proposed strong and weak hierarchical mixture models work well in controlling false-positive rates and, compared with the naive model that does not impose this hierarchical constraint, provide a powerful approach for identifying predisposing effects and interactions in gene–environment interaction studies. We illustrate our approach using data for lung cancer and cutaneous melanoma.
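The strong and weak hierarchical constraints described above can be illustrated with a minimal sketch. This is not the mixture model or sampler used in the paper; it only shows which pairwise interactions would be eligible for consideration given a set of main-effect inclusion indicators. The function name `eligible_interactions`, the factor labels, and the indicator values are hypothetical and chosen for illustration.

```python
from itertools import combinations


def eligible_interactions(main_included, rule="strong"):
    """Return the factor pairs whose interaction may be considered.

    main_included: dict mapping a factor label to True/False, indicating
        whether that factor's main effect is currently in the model.
    rule: "strong" requires both main effects to be present,
          "weak" requires at least one,
          "naive" imposes no hierarchical constraint.
    """
    checks = {
        "strong": lambda a, b: a and b,
        "weak": lambda a, b: a or b,
        "naive": lambda a, b: True,
    }
    if rule not in checks:
        raise ValueError(f"unknown rule: {rule!r}")
    allowed = checks[rule]
    return [
        (j, k)
        for j, k in combinations(main_included, 2)
        if allowed(main_included[j], main_included[k])
    ]


# Hypothetical indicators: three genetic factors and one environmental factor.
main = {"G1": True, "G2": False, "G3": False, "E": True}

strong = eligible_interactions(main, "strong")
weak = eligible_interactions(main, "weak")
naive = eligible_interactions(main, "naive")

print("strong:", strong)
print("weak:  ", weak)
print("naive: ", naive)
```

With these illustrative indicators, the strong rule admits only the G1-by-E interaction, the weak rule admits every pair except G2-by-G3, and the naive rule admits all six pairs. This shows how the hierarchical constraint shrinks the set of interactions to be evaluated.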
Acknowledgments
CIA has been supported by NIH Grants U19CA148127, P50CA093459, and P30CA023108. JM has been supported by NIH Grant R01CA134682. JM also acknowledges the support provided by the Biostatistics/Epidemiology/Research Design (BERD) component of the Center for Clinical and Translational Sciences (CCTS) for this project. CCTS is mainly funded by the NIH Centers for Translational Science Award (NIH CTSA) Grant (UL1 RR024148), awarded to the University of Texas Health Science Center at Houston in 2006 by the National Center for Research Resources (NCRR), and its renewal (UL1 TR000371) by the National Center for Advancing Translational Sciences (NCATS).
Cite this article
Liu, C., Ma, J. & Amos, C.I. Bayesian variable selection for hierarchical gene–environment and gene–gene interactions. Hum Genet 134, 23–36 (2015). https://doi.org/10.1007/s00439-014-1478-5