The impact of exposure-biased sampling designs on detection of gene–environment interactions in case–control studies with potential exposure misclassification

Stenzel, Stephanie L.; Ahn, Jaeil; Boonstra, Philip S.; Gruber, Stephen B.; Mukherjee, Bhramar

doi:10.1007/s10654-014-9908-1

METHODS
Published: 04 June 2014

European Journal of Epidemiology, Volume 30, pages 413–423 (2015)

Stephanie L. Stenzel¹,²,⁵, Jaeil Ahn³, Philip S. Boonstra⁴, Stephen B. Gruber⁵ & Bhramar Mukherjee⁴

Abstract

With limited funding and biological specimen availability, choosing an optimal sampling design to maximize power for detecting gene-by-environment (G–E) interactions is critical. Exposure-enriched sampling is often used to select subjects with rare exposures for genotyping to enhance power for tests of G–E effects. However, exposure misclassification (MC) combined with biased sampling can affect characteristics of tests for G–E interaction and joint tests for marginal association and G–E interaction. Here, we characterize the impact of exposure-biased sampling under conditions of perfect exposure information and exposure MC on properties of several methods for conducting inference. We assess the Type I error, power, bias, and mean squared error properties of case-only, case–control, and empirical Bayes methods for testing/estimating G–E interaction and a joint test for marginal G (or E) effect and G–E interaction across three biased sampling schemes. Properties are evaluated via empirical simulation studies. With perfect exposure information, exposure-enriched sampling schemes enhance power as compared to random selection of subjects irrespective of exposure prevalence but yield bias in estimation of the G–E interaction and marginal E parameters. Exposure MC modifies the relative performance of sampling designs when compared to the case of perfect exposure information. Those conducting G–E interaction studies should be aware of exposure MC properties and the prevalence of exposure when choosing an ideal sampling scheme and method for characterizing G–E interactions and joint effects.
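The kind of simulation the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' code: the parameter values, the sensitivity and specificity, the 50% exposed target, and all function names (`simulate_population`, `misclassify`, `sample_stratum`, `log_or_ge`, `run_demo`) are assumptions made for the example. G and E are taken to be binary and independent in the population, enrichment selects on the observed (possibly misclassified) exposure within cases and within controls, and the case-only and case–control estimates of the interaction log-odds ratio are computed from 2×2 tables.

```python
import numpy as np


def simulate_population(n, rng, p_g=0.3, p_e=0.1, beta_0=-3.0,
                        beta_g=0.2, beta_e=0.3, beta_gxe=0.7):
    """Binary G and E drawn independently; D from a logistic model (assumed values)."""
    g = rng.binomial(1, p_g, size=n)
    e = rng.binomial(1, p_e, size=n)
    logit = beta_0 + beta_g * g + beta_e * e + beta_gxe * g * e
    d = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
    return g, e, d


def misclassify(e, sensitivity, specificity, rng):
    """Non-differential misclassification of a binary exposure."""
    u = rng.random(e.size)
    # True exposed are recorded as exposed with probability `sensitivity`;
    # true unexposed are recorded as exposed with probability 1 - `specificity`.
    flagged = np.where(e == 1, u < sensitivity, u >= specificity)
    return flagged.astype(int)


def sample_stratum(pool, e_obs, n, frac_exposed, rng):
    """Draw n subjects from `pool`; frac_exposed=None means simple random sampling."""
    if frac_exposed is None:
        return rng.choice(pool, size=n, replace=False)
    exposed = pool[e_obs[pool] == 1]
    unexposed = pool[e_obs[pool] == 0]
    n_exp = min(int(round(frac_exposed * n)), exposed.size)
    return np.concatenate([
        rng.choice(exposed, size=n_exp, replace=False),
        rng.choice(unexposed, size=n - n_exp, replace=False),
    ])


def log_or_ge(g, e):
    """Log odds ratio between G and E from a 2x2 table (0.5 added to each cell)."""
    cells = [np.sum((g == a) & (e == b)) + 0.5 for a in (1, 0) for b in (1, 0)]
    n11, n10, n01, n00 = cells
    return float(np.log(n11 * n00 / (n10 * n01)))


def run_demo(seed=2014, n_pop=200_000, n_each=1000):
    rng = np.random.default_rng(seed)
    g, e, d = simulate_population(n_pop, rng)
    e_obs = misclassify(e, sensitivity=0.8, specificity=0.95, rng=rng)
    results = {}
    for design, frac in (("random", None), ("enriched", 0.5)):
        cases = sample_stratum(np.flatnonzero(d == 1), e_obs, n_each, frac, rng)
        controls = sample_stratum(np.flatnonzero(d == 0), e_obs, n_each, frac, rng)
        # Case-only estimate: G-E log odds ratio among cases.
        case_only = log_or_ge(g[cases], e_obs[cases])
        # Case-control estimate: cases' G-E log odds ratio minus controls'.
        case_control = case_only - log_or_ge(g[controls], e_obs[controls])
        results[design] = {
            "exposed_frac_cases": float(e_obs[cases].mean()),
            "case_only": case_only,
            "case_control": case_control,
        }
    return results


if __name__ == "__main__":
    for design, summary in run_demo().items():
        print(design, summary)
```

Repeating `run_demo` over many seeds and comparing the estimates with the assumed true value of `beta_gxe` would give rough bias and mean squared error figures for each design. The sketch omits the empirical Bayes estimator and the joint tests that the study also evaluates.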

Abbreviations

CC: Case–control
CO: Case-only
EB: Empirical Bayes
G: Genetic variant
E: Environmental exposure
D: Disease/outcome status
α_g: Marginal log-odds ratio associated with the genetic factor
MA_g: Marginal genetic association
α_e: Marginal log-odds ratio associated with the environmental factor
MA_e: Marginal environmental association
OR_ge: Odds ratio for the association between the genetic and environmental variables in controls
β_g: Main effect log-odds ratio associated with the genetic factor
OR_g: exp(β_g)
β_e: Main effect log-odds ratio associated with the environmental factor
OR_e: exp(β_e)
β_g×e: Gene by environment interaction log-odds ratio
OR_g×e: exp(β_g×e)
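
For orientation, these parameters are most naturally read as coefficients of logistic disease models. The forms below are an assumed, conventional reading rather than something stated on this page: the intercepts \beta_0 and \alpha_0 and the binary coding of G and E are introduced here for illustration only.

```latex
% Full model: main effects and interaction (assumed standard form)
\operatorname{logit} P(D = 1 \mid G, E)
  = \beta_0 + \beta_g G + \beta_e E + \beta_{g \times e}\, G E,
\qquad
OR_{g \times e} = \exp(\beta_{g \times e})

% Marginal models: one factor at a time (assumed standard form)
\operatorname{logit} P(D = 1 \mid G) = \alpha_0 + \alpha_g G,
\qquad
\operatorname{logit} P(D = 1 \mid E) = \alpha_0' + \alpha_e E
```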

Acknowledgments

Support for this study was provided by National Science Foundation DMS 1007494, National Institutes of Health ES 20811, National Institutes of Health CA 156608, and National Institutes of Health CA 148107. Funding for SLS was provided by the National Human Genome Research Institute at the National Institutes of Health (T32 HG00040), the National Institute of Environmental Health Sciences at the National Institutes of Health (T32 ES013678), and a fellowship from the University of Michigan Rackham Graduate School. Funding for PB and BM was partially provided by the University of Michigan Cancer Center Support Grant NIH P30 CA 046592.

Conflict of interest

The authors declare no conflicts of interest.

Author information

Authors and Affiliations

Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
Stephanie L. Stenzel
Department of Statistics, College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, MI, USA
Stephanie L. Stenzel
Department of Biostatistics and Bioinformatics, Georgetown University, Washington, DC, USA
Jaeil Ahn
Department of Biostatistics, School of Public Health, University of Michigan, M4166 SPH II, 1415 Washington Heights, Ann Arbor, MI, 48109, USA
Philip S. Boonstra & Bhramar Mukherjee
Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, CA, USA
Stephanie L. Stenzel & Stephen B. Gruber

Corresponding author

Correspondence to Stephanie L. Stenzel.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 624 kb)

About this article

Cite this article

Stenzel, S.L., Ahn, J., Boonstra, P.S. et al. The impact of exposure-biased sampling designs on detection of gene–environment interactions in case–control studies with potential exposure misclassification. Eur J Epidemiol 30, 413–423 (2015). https://doi.org/10.1007/s10654-014-9908-1

Received: 18 April 2013
Accepted: 25 April 2014
Published: 04 June 2014
Issue Date: May 2015
DOI: https://doi.org/10.1007/s10654-014-9908-1
