# Gene-Environment Independence in Case–Control Studies: Issues of Parameterization and Bayesian Inference

## Abstract

We consider the problem of exploiting the gene–environment independence (GEI) assumption in a case–control study that aims to infer the joint effect of genotype and environmental exposure on disease risk. Specifically, we focus on the special case in which both genotype and environmental exposure are binary. We note that the prospective intercept can sometimes be identified as a pair of “twin” values. Also, the GEI and general maximum-likelihood estimators of the gene–environment interaction coincide if the data cell proportions are directly compatible with the GEI assumption. Further, we approach the problem in a Bayesian framework by reweighting the general posterior subject to the prior specified over the subset of the parameter space that is consistent with the GEI assumption. Simulation studies compare the proposed method with its general counterpart. Finally, we extend the proposed method to address the concern that the GEI assumption may sometimes be violated.
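The coincidence of the two interaction estimators can be illustrated with a small numerical sketch. It is not the estimator developed in the paper: as a stand-in for a GEI-based estimator it uses the classical case-only log odds ratio, which relies on a rare-disease approximation, and it compares this with the interaction estimate from a saturated logistic model. The function names and the cell counts are hypothetical, chosen so that the gene–environment odds ratio among controls is exactly 1.

```python
import math

def log_odds_ratio(n00, n01, n10, n11):
    """Log odds ratio of a 2x2 table; n_ge counts subjects with genotype g, exposure e."""
    return math.log((n11 * n00) / (n10 * n01))

def interaction_general(cases, controls):
    """Saturated-model estimate of the multiplicative interaction (log scale):
    gene-environment log OR among cases minus that among controls."""
    return log_odds_ratio(*cases) - log_odds_ratio(*controls)

def interaction_case_only(cases):
    """Case-only estimate: gene-environment log OR among cases alone.
    Assumes gene-environment independence and a rare disease."""
    return log_odds_ratio(*cases)

# Hypothetical counts, ordered (n00, n01, n10, n11).
cases = (40, 20, 30, 45)
controls_indep = (60, 30, 40, 20)   # 60*20 == 30*40, so the control OR is exactly 1
controls_assoc = (60, 30, 40, 40)   # control OR is 2, so independence fails in the data

print(interaction_general(cases, controls_indep), interaction_case_only(cases))
print(interaction_general(cases, controls_assoc), interaction_case_only(cases))
```

With the first set of controls the two estimates agree, because the control term vanishes; with the second set they differ by the control log odds ratio.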

### Keywords

Bayesian inference; Binary covariates; Case–control study; Constrained posterior; Gene–environment independence; Identifiability
