Detection of Conditional Dependence Between Multiple Variables Using Multiinformation

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 12747)

Abstract

We consider the problem of detecting conditional dependence between multiple discrete variables. This generalizes the well-known and widely studied problem of testing conditional independence between two variables given a third one. The issue is important in various applications. For example, in supervised learning, such a test can be used to verify the model adequacy of the popular Naive Bayes classifier. In epidemiology, there is a need to verify whether the occurrences of multiple diseases are dependent. However, focusing solely on the occurrences of diseases may be misleading: one has to take confounding variables (such as gender or age) into account and preferably consider the conditional dependencies between diseases given those confounders. To address this problem, we propose to use conditional multiinformation (CMI), a measure derived from information theory. We prove some new properties of CMI. To account for the uncertainty associated with a given data sample, we propose a formal statistical test of conditional independence based on the empirical version of CMI. The main contribution of the work is the determination of the asymptotic distribution of empirical CMI, which leads to the construction of an asymptotic test for conditional independence. The asymptotic test is compared with the permutation test and the scaled chi-squared test. Simulation experiments indicate that the asymptotic test achieves larger power than the competing methods, thus leading to more frequent detection of conditional dependencies when they occur. We apply the method to detect dependencies in the MIMIC-III medical data set.
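The empirical CMI and the permutation test mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard plug-in definition CMI(X_1, ..., X_p | Z) = sum_i H(X_i, Z) - H(X_1, ..., X_p, Z) - (p - 1) H(Z), which may differ in notation or detail from the paper, and it permutes each variable within the strata of Z to approximate the null distribution. The function names `entropy`, `empirical_cmi` and `permutation_pvalue` are hypothetical, and the asymptotic test proposed in the paper is not reproduced here.

```python
import numpy as np
from collections import Counter


def entropy(columns):
    """Plug-in Shannon entropy (in nats) of the joint distribution of discrete columns."""
    rows = list(zip(*columns))  # one tuple per observation
    counts = np.array(list(Counter(rows).values()), dtype=float)
    p = counts / len(rows)
    return float(-np.sum(p * np.log(p)))


def empirical_cmi(xs, z):
    """Plug-in conditional multiinformation of the variables in xs given z.

    Assumed definition: sum_i H(X_i, Z) - H(X_1, ..., X_p, Z) - (p - 1) H(Z).
    """
    p = len(xs)
    marginal_part = sum(entropy([x, z]) for x in xs)
    return marginal_part - entropy(list(xs) + [z]) - (p - 1) * entropy([z])


def permutation_pvalue(xs, z, n_perm=200, seed=0):
    """Permutation p-value for H0: X_1, ..., X_p are mutually independent given Z.

    Each X_i is shuffled independently within every stratum of Z, which keeps
    the (X_i, Z) relationships intact but breaks dependence among the X_i.
    """
    rng = np.random.default_rng(seed)
    xs = [np.asarray(x) for x in xs]
    z = np.asarray(z)
    observed = empirical_cmi(xs, z)
    strata = [np.flatnonzero(z == value) for value in np.unique(z)]
    exceed = 0
    for _ in range(n_perm):
        permuted = []
        for x in xs:
            x_perm = x.copy()
            for idx in strata:
                x_perm[idx] = x[rng.permutation(idx)]
            permuted.append(x_perm)
        if empirical_cmi(permuted, z) >= observed:
            exceed += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    z = rng.integers(0, 2, size=500)       # confounder
    x1 = rng.integers(0, 2, size=500)
    x2 = x1 ^ (rng.random(500) < 0.2)      # noisy copy of x1: dependent given z
    x3 = rng.integers(0, 2, size=500)
    print(permutation_pvalue([x1, x2, x3], z))
```

With few strata and large samples, the plug-in estimate is cheap to compute. The permutation loop multiplies that cost by `n_perm`, which is one practical reason an asymptotic null distribution is attractive.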

Keywords

  • Detection of conditional dependence
  • Conditional multiinformation
  • Information theory
  • Weighted chi squared distribution
  • Kullback-Leibler divergence

Notes

  1. https://github.com/teisseyrep/cmi

Author information

Corresponding author

Correspondence to Jan Mielniczuk.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Mielniczuk, J., Teisseyre, P. (2021). Detection of Conditional Dependence Between Multiple Variables Using Multiinformation. In: Paszynski, M., Kranzlmüller, D., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2021. ICCS 2021. Lecture Notes in Computer Science, vol 12747. Springer, Cham. https://doi.org/10.1007/978-3-030-77980-1_51

  • DOI: https://doi.org/10.1007/978-3-030-77980-1_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-77979-5

  • Online ISBN: 978-3-030-77980-1

  • eBook Packages: Computer Science, Computer Science (R0)