Abstract
In genetic studies of complex diseases, a crucial task is to identify and quantify gene–gene interactions, which are often defined as deviation from additive genetic effects. This statistical definition, however, need not reflect the biological interaction of genes. We propose a new method to detect gene–gene interactions that exploits the concepts of synergy and antagonism, which are well suited to capturing biological relationships. The conditional synergy index (CSI) describes the extent of interaction on the penetrance scale. We develop the CSI for two-locus disease models and cohort data. The index assumes that genotypes are dichotomized into risk genotypes (exposed) and non-risk genotypes (unexposed), but it does not assume the loci to be in linkage equilibrium. We investigate the performance of the CSI and compare it to classical epidemiological interaction measures such as Rothman’s synergy index (S) and the attributable proportion due to interaction (AP). In addition, the performance of an estimator of this new parameter is illustrated in a practical example.
Abbreviations
- ADD: Additive penetrance model
- AP: Attributable proportion due to interaction
- CAPN5: Calpain 5
- CSI: Conditional synergy index
- EPI ind: Individual epistatic effects model
- EPI rr: Double recessive epistatic effects model
- HET: Heterogeneity penetrance model
- MULT: Multiplicative penetrance model
- PPARD: Peroxisome proliferator-activated receptor δ
- S: Synergy index
Acknowledgments
This research was supported by the grant PI 345/2-1 from the German Research Foundation (DFG). I am grateful to Iris Pigeot for her constructive comments that greatly helped to improve the manuscript.
Technical appendix
Let us assume an independent and identically distributed (iid) sequence of multinomially distributed random vectors \({\bf X}=(X_{000},\ldots, X_{JKL})\) with parameters \(P(D=j,G_A=k,G_B=l) = p_{jkl},\) \(p_{jkl} \in (0,1),\) \(j=0,\ldots,J,\) \(k=0,\ldots,K,\) \(l=0,\ldots,L,\) and sample sizes \(N_{jkl}\) for each combination \(j, k, l,\) with \(\sum_{j=0}^J \sum_{k=0}^K \sum_{l=0}^L N_{jkl} = N.\) The unbiased maximum-likelihood (ML) estimator of \(p_{jkl}\) is given by \(\hat{p}_{jkl}= \frac{X_{jkl}}{N}\) with \(\operatorname{var}(\hat{p}_{jkl})=p_{jkl}(1-p_{jkl})/N.\)
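The multinomial setup above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the article: the cell probabilities in `p_true`, the sample size, and the helper names `simulate_counts` and `ml_estimate` are hypothetical, and the example fixes J = K = L = 1. The reported variance is the plug-in version, with the unknown cell probability replaced by its estimate.

```python
import random

# Hypothetical cell probabilities p_jkl = P(D=j, G_A=k, G_B=l), j, k, l in {0, 1}.
p_true = {
    (0, 0, 0): 0.36, (0, 0, 1): 0.17, (0, 1, 0): 0.17, (0, 1, 1): 0.10,
    (1, 0, 0): 0.04, (1, 0, 1): 0.03, (1, 1, 0): 0.03, (1, 1, 1): 0.10,
}


def simulate_counts(p, n, rng):
    """Draw n iid individuals and return the multinomial cell counts X_jkl."""
    cells = list(p)
    draws = rng.choices(cells, weights=[p[c] for c in cells], k=n)
    counts = {c: 0 for c in cells}
    for c in draws:
        counts[c] += 1
    return counts


def ml_estimate(counts):
    """Return p_hat = X / N and the plug-in variance p_hat * (1 - p_hat) / N."""
    n = sum(counts.values())
    p_hat = {c: x / n for c, x in counts.items()}
    var_hat = {c: q * (1 - q) / n for c, q in p_hat.items()}
    return p_hat, var_hat


if __name__ == "__main__":
    rng = random.Random(2009)  # fixed seed only for reproducibility
    p_hat, var_hat = ml_estimate(simulate_counts(p_true, 5000, rng))
    for cell in sorted(p_hat):
        print(cell, round(p_hat[cell], 4), f"{var_hat[cell]:.2e}")
```

The dictionary keys follow the index order (j, k, l) used in the text, so `p_hat[(1, 1, 1)]` estimates the probability of a diseased individual carrying both risk genotypes.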
Consistency of \(\widehat{CSI}\)
Theorem 1 Assuming the multinomial model described above and \(K = L = 1,\) \(\widehat{CSI}\) converges almost surely to CSI as \(N \rightarrow \infty,\) provided that \(\lim_{N \rightarrow \infty} \hat{\delta}_{dab} \neq 0.\)
Proof For the ML estimators \(\hat{{\bf p}}\) it holds almost surely, by the strong law of large numbers, that \(\lim_{N \rightarrow \infty} \hat{{\bf p}} = \lim_{N \rightarrow \infty} \left(\frac{X_{000}}{N}, \ldots, \frac{X_{111}}{N} \right) = {\bf p},\) \(j, k, l = 0,1.\) Since \(\hat{\delta}_{dab}\) is a continuous function of \(\hat{{\bf p}},\) Slutsky’s Theorem can be applied, which yields
\[\lim_{N \rightarrow \infty} \hat{\delta}_{dab} = \delta_{dab}.\]
It follows directly that
\[\lim_{N \rightarrow \infty} \widehat{CSI} = CSI.\]
□
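The consistency argument can be illustrated by simulation. The sketch below is an assumption-laden stand-in: because the CSI formula is not restated in this appendix, it uses Rothman's synergy index S, the comparison measure named in the abstract, as an example of a continuous function of \(\hat{{\bf p}}\). The cell probabilities `p_true`, the sample sizes, and the function names are hypothetical.

```python
import random

# Hypothetical cell probabilities p_jkl = P(D=j, G_A=k, G_B=l).
p_true = {
    (0, 0, 0): 0.36, (0, 0, 1): 0.17, (0, 1, 0): 0.17, (0, 1, 1): 0.10,
    (1, 0, 0): 0.04, (1, 0, 1): 0.03, (1, 1, 0): 0.03, (1, 1, 1): 0.10,
}


def penetrance(p, k, l):
    """P(D=1 | G_A=k, G_B=l) computed from the joint cell probabilities."""
    return p[(1, k, l)] / (p[(0, k, l)] + p[(1, k, l)])


def synergy_index(p):
    """Rothman's S = (RR11 - 1) / ((RR10 - 1) + (RR01 - 1)), reference (0, 0)."""
    f00 = penetrance(p, 0, 0)
    rr = {kl: penetrance(p, *kl) / f00 for kl in [(0, 1), (1, 0), (1, 1)]}
    return (rr[(1, 1)] - 1) / ((rr[(1, 0)] - 1) + (rr[(0, 1)] - 1))


def plug_in_estimate(p, n, rng):
    """Simulate n individuals, form p_hat = X / N, and evaluate S at p_hat."""
    cells = list(p)
    counts = {c: 0 for c in cells}
    for c in rng.choices(cells, weights=[p[c] for c in cells], k=n):
        counts[c] += 1
    return synergy_index({c: x / n for c, x in counts.items()})


if __name__ == "__main__":
    rng = random.Random(2009)  # fixed seed only for reproducibility
    s_true = synergy_index(p_true)
    for n in (2_000, 20_000, 200_000):
        print(n, round(plug_in_estimate(p_true, n, rng), 3), "true S:", round(s_true, 3))
```

As N grows, the plug-in values should settle around the true value, as long as the denominator of the index stays away from zero. This mirrors the role of the condition on \(\hat{\delta}_{dab}\) in Theorem 1.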
Variance of \(\widehat{CSI}\)
To find the variance of \(CSI(\hat{{\bf p}}),\) we use the standard delta method based on a stochastic expansion of CSI at \({\bf p},\) which yields:
where
and \({\mathcal{I}}_j, {\mathcal{I}}_{k> l}\) are indicator functions that assign the values 0 and 1 as follows:
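As a numerical illustration of the delta method used in this section, the variance of a smooth function \(g(\hat{{\bf p}})\) can be approximated by \(\nabla g({\bf p})^{T}\, \Sigma\, \nabla g({\bf p})\) with the multinomial covariance \(\Sigma = (\mathrm{diag}({\bf p}) - {\bf p}{\bf p}^{T})/N.\) The sketch below uses a central-difference gradient instead of a closed-form derivative and, as a hypothetical stand-in for the CSI, evaluates Rothman's synergy index S. The probabilities in `p`, the step size `h`, and all function names are assumptions made for the example.

```python
# Hypothetical cell probabilities, ordered by index 4*j + 2*k + l
# for P(D=j, G_A=k, G_B=l) with j, k, l in {0, 1}.
p = [0.36, 0.17, 0.17, 0.10, 0.04, 0.03, 0.03, 0.10]


def synergy_index(q):
    """Rothman's S from joint cell probabilities (stand-in for the CSI)."""
    def f(k, l):  # penetrance P(D=1 | G_A=k, G_B=l)
        return q[4 + 2 * k + l] / (q[2 * k + l] + q[4 + 2 * k + l])
    rr10, rr01, rr11 = f(1, 0) / f(0, 0), f(0, 1) / f(0, 0), f(1, 1) / f(0, 0)
    return (rr11 - 1) / ((rr10 - 1) + (rr01 - 1))


def multinomial_cov(q, n):
    """Covariance matrix of p_hat = X / N: (diag(q) - q q^T) / N."""
    m = len(q)
    return [[(q[i] * (1 - q[i]) if i == j else -q[i] * q[j]) / n
             for j in range(m)] for i in range(m)]


def numerical_gradient(g, q, h=1e-6):
    """Central-difference approximation of the gradient of g at q."""
    grad = []
    for i in range(len(q)):
        up, down = list(q), list(q)
        up[i] += h
        down[i] -= h
        grad.append((g(up) - g(down)) / (2 * h))
    return grad


def delta_variance(g, q, n):
    """First-order delta-method variance of g(p_hat) for sample size n."""
    grad = numerical_gradient(g, q)
    cov = multinomial_cov(q, n)
    m = len(q)
    return sum(grad[i] * cov[i][j] * grad[j] for i in range(m) for j in range(m))


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(n, delta_variance(synergy_index, p, n))
```

A quick sanity check of the design: for the coordinate function `lambda q: q[7]`, `delta_variance` reduces to \(p_{111}(1-p_{111})/N,\) the variance of a single ML cell estimate.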
Cite this article
Foraita, R. A conditional synergy index to assess biological interaction. Eur J Epidemiol 24, 485–494 (2009). https://doi.org/10.1007/s10654-009-9378-z