A conditional synergy index to assess biological interaction

  • METHODS
European Journal of Epidemiology

Abstract

In genetic studies of complex diseases, a crucial task is to identify and quantify gene–gene interactions, which are often defined as deviation from additive genetic effects. This statistical definition, however, need not reflect the biological interaction of genes. We propose a new method to detect gene–gene interactions that exploits the concepts of synergy and antagonism, which are appropriate for capturing biological relationships. The conditional synergy index (CSI) describes the extent of interaction on the penetrance scale. We develop the CSI for two-locus disease models and cohort data. The index assumes that genotypes are dichotomized into risk genotypes (exposed) and non-risk genotypes (unexposed), but it does not assume the loci to be in linkage equilibrium. We investigate the performance of the CSI and compare it to classical epidemiological interaction measures such as Rothman’s synergy index (S) and the attributable proportion due to interaction (AP). In addition, the performance of an estimator of this new parameter is illustrated in a practical example.


Abbreviations

ADD: Additive penetrance model

AP: Attributable proportion due to interaction

CAPN5: Calpain 5

CSI: Conditional synergy index

\(EPI_{ind}\): Individual epistatic effects model

\(EPI_{rr}\): Double recessive epistatic effects model

HET: Heterogeneity penetrance model

MULT: Multiplicative penetrance model

PPARD: Peroxisome proliferator-activated receptor δ

S: Synergy index


Acknowledgments

This research was supported by the grant PI 345/2-1 from the German Research Foundation (DFG). I am grateful to Iris Pigeot for her constructive comments that greatly helped to improve the manuscript.

Author information

Correspondence to Ronja Foraita.

Technical appendix


Let us assume an independent identically distributed (iid) sequence of multinomially distributed random vectors \({\bf X}=(X_{000},\ldots, X_{JKL})\) with parameters \(P(D=j,G_A=k,G_B=l) = p_{jkl},\) \(p_{jkl} \in (0,1),\) \(j=0,\ldots,J,\) \(k=0,\ldots,K,\) \(l=0,\ldots,L,\) and sample sizes \(N_{jkl}\) for each combination \(j, k, l,\) with \(\sum_{j=0}^J \sum_{k=0}^K \sum_{l=0}^L N_{jkl} = N.\) The unbiased maximum-likelihood (ML) estimator of \(p_{jkl}\) is given as \(\hat{p}_{jkl}= \frac{X_{jkl}}{N}\) with \({var}(\hat{p}_{jkl})=p_{jkl}(1-p_{jkl})/N.\)
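As a rough illustration of the estimator above, the following sketch computes \(\hat{p}_{jkl}\) and a plug-in estimate of its variance from a 2 × 2 × 2 table of counts. The counts are invented for illustration, and replacing \(p_{jkl}\) by \(\hat{p}_{jkl}\) in the variance formula is an assumption of this sketch, not a step taken from the text.

```python
import numpy as np

# Hypothetical cohort counts X[j, k, l] (illustrative values only):
# j = disease status D, k = genotype at locus A, l = genotype at locus B.
counts = np.array([[[400, 150], [150, 60]],
                   [[40, 30], [30, 40]]], dtype=float)

n_total = counts.sum()                      # N
p_hat = counts / n_total                    # ML estimator X_jkl / N
# Plug-in variance estimate p(1 - p) / N, with p replaced by p_hat.
var_hat = p_hat * (1.0 - p_hat) / n_total

print(p_hat)
print(var_hat)
```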

Consistency of \(\widehat{CSI}\)

Theorem 1 Assuming the multinomial model described above and K = L = 1, \(\widehat{CSI}\) converges almost surely to CSI as N → ∞, provided that \(\lim_{N \rightarrow \infty} \hat{\delta}_{dab} \neq 0.\)

Proof By the strong law of large numbers, the ML estimators \(\hat{{\bf p}}\) satisfy \(\lim_{N \rightarrow \infty} \hat{{\bf p}} = \lim_{N \rightarrow \infty} \left(\frac{X_{000}}{N}, \ldots, \frac{X_{111}}{N} \right) = {\bf p}\) almost surely, for j, k, l = 0, 1. Since \(\hat{\delta}_{dab}\) is a continuous function of \(\hat{{\bf p}}\), Slutsky’s Theorem can be applied, which yields

$$ \begin{aligned} \lim_{N \rightarrow \infty} \hat{\delta}_{dab} = &\lim_{N \rightarrow \infty} \frac{1}{N} \left[ \left(\sum_j X_{jab} - \sum_j X_{ja'b'}\right) \frac{X_{dab}} {\sum_j X_{jab}} \right.\\ &+ \left. \left(\sum_j \sum_l X_{ja'l}\right) \frac{X_{da'b}} {\sum_j X_{ja'b}} + \left(\sum_j \sum_k X_{jkb'}\right) \frac{X_{dab'}} {\sum_j X_{jab'}} \right] \\ =&\lim_{N \rightarrow \infty} \frac{1}{N} \left[ \left(\sum_j \hat{p}_{jab}N - \sum_j \hat{p}_{ja'b'}N \right) \frac{\hat{p}_{dab}N}{\sum_j \hat{p}_{jab}N}\right.\\ &+ \left.\left(\sum_j \sum_l \hat{p}_{ja'l}N \right) \frac{\hat{p}_{da'b}N} {\sum_j \hat{p}_{ja'b}N} + \left(\sum_j \sum_k \hat{p}_{jkb'}N \right) \frac{\hat{p}_{dab'}N} {\sum_j \hat{p}_{jab'}N} \right] \\ =& \left(\sum_j p_{jab} - \sum_j p_{ja'b'}\right) \frac{p_{dab}} {\sum_j p_{jab}} \\ &+ \left(\sum_j \sum_l p_{ja'l}\right) \frac{p_{da'b}} {\sum_j p_{ja'b}} + \left(\sum_j \sum_k p_{jkb'}\right) \frac{p_{dab'}}{\sum_j p_{jab'}}\\ =& \delta_{dab}. \end{aligned} $$

It follows directly that

$$ \lim_{N \rightarrow \infty} \widehat{CSI} = \lim_{N \rightarrow \infty} \left( \frac{\frac{\hat{\delta}_{111}} {\hat{\delta}_{011}} + \frac{\hat{\delta}_{100}} {\hat{\delta}_{000}}} {\frac{\hat{\delta}_{101}} {\hat{\delta}_{001}} + \frac{\hat{\delta}_{110}} {\hat{\delta}_{010}}} \right) = \frac{\frac{\lim_{N \rightarrow \infty}\hat{\delta}_{111}} {\lim_{N \rightarrow \infty}\hat{\delta}_{011}} + \frac{\lim_{N \rightarrow \infty}\hat{\delta}_{100}}{\lim_{N \rightarrow \infty}\hat{\delta}_{000}}} {\frac{\lim_{N \rightarrow \infty}\hat{\delta}_{101}} {\lim_{N \rightarrow \infty}\hat{\delta}_{001}} + \frac{\lim_{N \rightarrow \infty}\hat{\delta}_{110}} {\lim_{N \rightarrow \infty}\hat{\delta}_{010}}} = \frac{\frac{\delta_{111}} {\delta_{011}} + \frac{\delta_{100}} {\delta_{000}}} {\frac{\delta_{101}} {\delta_{001}} + \frac{\delta_{110}} {\delta_{010}}}= CSI. $$
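The two displays above can be turned into a small numerical sketch. It assumes that the primed indices \(a'\) and \(b'\) denote the complementary genotype (\(a' = 1 - a\)), which the appendix does not state explicitly; under that reading \(\delta_{0ab} + \delta_{1ab} = 1\). The probability table and the function names `delta` and `csi` are hypothetical, chosen only for illustration.

```python
import numpy as np

def delta(p, d, a, b):
    """delta_{dab} for a 2x2x2 array p[j, k, l] = P(D=j, G_A=k, G_B=l)."""
    a_c, b_c = 1 - a, 1 - b  # assumption: a', b' are the complementary genotypes
    p_ab = p[:, a, b].sum()
    term1 = (p_ab - p[:, a_c, b_c].sum()) * p[d, a, b] / p_ab
    term2 = p[:, a_c, :].sum() * p[d, a_c, b] / p[:, a_c, b].sum()
    term3 = p[:, :, b_c].sum() * p[d, a, b_c] / p[:, a, b_c].sum()
    return term1 + term2 + term3

def csi(p):
    """CSI as the ratio of summed delta_{1ab} / delta_{0ab} terms."""
    def ratio(a, b):
        return delta(p, 1, a, b) / delta(p, 0, a, b)
    return (ratio(1, 1) + ratio(0, 0)) / (ratio(0, 1) + ratio(1, 0))

# Hypothetical true cell probabilities (illustrative values only).
p_true = np.array([[[400, 150], [150, 60]],
                   [[40, 30], [30, 40]]], dtype=float) / 900.0

# Simulate one large multinomial sample and plug in the ML estimates.
rng = np.random.default_rng(0)
sample = rng.multinomial(1_000_000, p_true.ravel()).reshape(2, 2, 2)
csi_hat = csi(sample / sample.sum())

print(csi(p_true), csi_hat)
```

For a large sample the plug-in value should lie close to the value computed from the true probabilities, in line with Theorem 1; the simulation is only a plausibility check, not a proof.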

Variance of \(\widehat{CSI}\)

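To find the asymptotic variance of \(CSI(\hat{{\bf p}}),\) we use the standard delta method based on a stochastic expansion of CSI at \({\bf p}.\) Writing \(g({\bf p}) = CSI({\bf p})\) and \({\bf\phi}\) for the vector of partial derivatives \(\partial g({\bf p})/\partial p_{jkl},\) this yields: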

$$ \begin{aligned} {var}_A(\widehat{CSI})= &\frac{1}{N} \left( {\bf\phi}' {\bf diag(p)} {\bf\phi} - ({\bf\phi}' {\bf p})^2 \right)\\ =& \frac{1}{N} \sum_{j=0}^1 \sum_{k=0}^1 \sum_{l=0}^1 p_{jkl} \left(\frac{\partial g({\bf p})}{\partial p_{jkl}}\right)^2 - \frac{1}{N} \left(\sum_{j=0}^1 \sum_{k=0}^1 \sum_{l=0}^1 p_{jkl} \frac{\partial g({\bf p})}{\partial p_{jkl}}\right)^2\\ =& \frac{1}{N \left( Q_{10} + Q_{01} \right) ^2} \sum_{j=0}^1 \sum_{k=0}^1 \sum_{l=0}^1 p_{jkl}(1-p_{jkl}) \left(\pi^c_{jkl} + \pi^d_{jkl} \right)^2, \end{aligned} $$

where

$$ \begin{aligned} \pi^c_{jkl} =& \frac{\frac{-p_{1k'l'} +Q_{k'l'}p_{0k'l'}} {\sum_{j^*}p_{j^*k'l'}} + \frac{p_{1kl'} -Q_{k'l'}p_{0kl'}} {\sum_{j^*}p_{j^*kl'}} +\frac{p_{1k'l}-Q_{k'l'}p_{0k'l}} {\sum_{j^*}p_{j^*k'l}}}{\delta_{0k'l'}}\\ & + \frac{(-1)^{1-{\mathcal{I}}_j} \cdot \left({\mathcal{I}}_j + (1-{\mathcal{I}}_j) Q_{kl} - p_{j'kl}\frac{\sum_{j^*} p_{j^*k'l'}} {(\sum_{j^*}p_{j^*kl})^2}\left[1 +Q_{kl} \right] \right)} {\delta_{0kl}} \\ & - CSI \left((-1)^{1-{\mathcal{I}}_j} \cdot \frac{{\mathcal{I}}_j +(1-{\mathcal{I}}_j) Q_{kl'} + p_{j'kl} \frac{\sum_{j^*}p_{j^*k'l}}{(\sum_{j^*}p_{j^*kl})^2} \left[1 + Q_{kl'} \right]} {\delta_{0kl'}}\right.\\ & +\left.(-1)^{1-{\mathcal{I}}_j} \cdot\frac{{\mathcal{I}}_j + (1-{\mathcal{I}}_j) Q_{k'l} + p_{j'kl}\frac{\sum_{j^*}p_{j^*kl'}} {(\sum_{j^*}p_{j^*kl})^2} \left[1 +Q_{k'l} \right]} {\delta_{0k'l}} \right) \\ \pi^d_{jkl} =&(-1)^{1-{\mathcal{I}}_j} \cdot \frac{{\mathcal{I}}_j +(1-{\mathcal{I}}_j) Q_{kl'} + p_{j'kl} \frac{\sum_{j^*}p_{j^*k'l}}{(\sum_{j^*}p_{j^*kl})^2} \left[1 + Q_{kl'} \right]} {\delta_{0kl'}}\\ & +(-1)^{1-{\mathcal{I}}_j} \cdot \frac{{\mathcal{I}}_j +(1-{\mathcal{I}}_j) Q_{k'l} + p_{j'kl} \frac{\sum_{j^*}p_{j^*kl'}}{(\sum_{j^*}p_{j^*kl})^2} \left[1 + Q_{k'l} \right]} {\delta_{0k'l}}\\ &- CSI \left( {\mathcal{I}}_{k> l} \cdot \frac{\left[\frac{p_{1k'l'} - Q_{k'l'}p_{0k'l'}} {\sum_{j^*}p_{j^*k'l'}}\right]} {\delta_{0k'l'}} + (1- {\mathcal{I}}_{k> l})\frac{\left[\frac{-p_{1k'l'} + Q_{k'l'}p_{0k'l'}}{\sum_{j^*}p_{j^*k'l'}}\right]} {\delta_{0k'l'}} \right.\\ & +\frac{ \left[\frac{p_{1kl'} - Q_{k'l'}p_{0kl'}}{\sum_{j^*}p_{j^*kl'}} + \frac{p_{1k'l}-Q_{k'l'}p_{0k'l}}{\sum_{j^*}p_{j^*k'l}}\right]} {\delta_{0k'l'}} \\ & +\left.\frac{(-1)^{1-{\mathcal{I}}_j} \cdot \left(\right.{\mathcal{I}}_j +(1-{\mathcal{I}}_j) Q_{kl} - p_{j'kl} \frac{\sum_{j^*} p_{j^*k'l'}}{(\sum_{j^*}p_{j^*kl})^2} \left[1 + Q_{kl} \right]} {\delta_{0kl}}\right ), \end{aligned} $$

and \({\mathcal{I}}_j, {\mathcal{I}}_{k> l}\) are indicator functions that assign the values 0 and 1 as follows:

$$ {\mathcal{I}}_j = \left\{\begin{array}{ll} 1 & j=1\\ 0 & j=0 \end{array}\right. \quad \quad {\mathcal{I}}_{k> l} = \left\{\begin{array}{ll}1 & k> l \\ 0 & k< l\end{array}\right. $$
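The general delta-method expression \(\frac{1}{N}\left({\bf\phi}' {\bf diag(p)} {\bf\phi} - ({\bf\phi}' {\bf p})^2\right)\) can also be evaluated numerically. The sketch below does not use the closed-form terms \(\pi^c_{jkl}\) and \(\pi^d_{jkl}\); instead it approximates the gradient of CSI by central finite differences and plugs in \(\hat{{\bf p}}\) for \({\bf p}\). As before, it assumes that primed indices denote complementary genotypes, and the function names, step size `h`, and counts are hypothetical.

```python
import numpy as np

def delta(p, d, a, b):
    """delta_{dab} for a 2x2x2 array p[j, k, l] = P(D=j, G_A=k, G_B=l)."""
    a_c, b_c = 1 - a, 1 - b  # assumption: a', b' are the complementary genotypes
    p_ab = p[:, a, b].sum()
    term1 = (p_ab - p[:, a_c, b_c].sum()) * p[d, a, b] / p_ab
    term2 = p[:, a_c, :].sum() * p[d, a_c, b] / p[:, a_c, b].sum()
    term3 = p[:, :, b_c].sum() * p[d, a, b_c] / p[:, a, b_c].sum()
    return term1 + term2 + term3

def csi(p):
    """CSI as the ratio of summed delta_{1ab} / delta_{0ab} terms."""
    def ratio(a, b):
        return delta(p, 1, a, b) / delta(p, 0, a, b)
    return (ratio(1, 1) + ratio(0, 0)) / (ratio(0, 1) + ratio(1, 0))

def csi_gradient(p, h=1e-6):
    """Central finite-difference approximation of d CSI / d p_jkl."""
    phi = np.zeros_like(p)
    for idx in np.ndindex(p.shape):
        step = np.zeros_like(p)
        step[idx] = h
        phi[idx] = (csi(p + step) - csi(p - step)) / (2.0 * h)
    return phi

def csi_variance(counts, h=1e-6):
    """Plug-in delta-method variance (phi' diag(p) phi - (phi' p)^2) / N."""
    counts = np.asarray(counts, dtype=float)
    n_total = counts.sum()
    p_hat = counts / n_total
    phi = csi_gradient(p_hat, h)
    return (np.sum(p_hat * phi**2) - np.sum(p_hat * phi) ** 2) / n_total

# Hypothetical cohort counts (illustrative values only).
counts = np.array([[[400, 150], [150, 60]],
                   [[40, 30], [30, 40]]], dtype=float)
var_a = csi_variance(counts)
print(var_a, np.sqrt(var_a))
```

A numerical gradient avoids transcribing the lengthy analytic derivatives, at the cost of a small approximation error; for a given table of proportions the resulting variance shrinks in proportion to 1/N.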

About this article

Cite this article

Foraita, R. A conditional synergy index to assess biological interaction. Eur J Epidemiol 24, 485–494 (2009). https://doi.org/10.1007/s10654-009-9378-z
