Abstract
In this paper we study a multiple testing problem in which the individual hypotheses of interest correspond to conditional independence of two variables X and Y given each of several conditioning variables. Approaches to such problems that avoid inflating the probability of spurious rejections are widely studied and applied. Here we introduce a direct approach based on the Joint Mutual Information (JMI) statistic, which restates the problem as one of testing a single hypothesis. The distribution of the JMI test statistic is established and shown to be well approximated numerically on the basis of a single data sample. The corresponding test is studied on artificial data sets and is shown to perform promisingly compared with general-purpose multiple testing methods such as the Bonferroni and Simes procedures.
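For orientation, the two general-purpose baselines named in the abstract can be sketched as tests of a single global null hypothesis (all individual hypotheses true), which is the setting the JMI test is compared in. This is a minimal illustration of the textbook Bonferroni and Simes rules only. It is not the JMI test, and the function names, the default level alpha = 0.05, and the example p-values are assumptions made here, not taken from the paper.

```python
from typing import Sequence


def bonferroni_global_test(p_values: Sequence[float], alpha: float = 0.05) -> bool:
    """Reject the global null if the smallest p-value is at most alpha / m."""
    m = len(p_values)
    if m == 0:
        raise ValueError("need at least one p-value")
    return min(p_values) <= alpha / m


def simes_global_test(p_values: Sequence[float], alpha: float = 0.05) -> bool:
    """Reject the global null if p_(i) <= i * alpha / m for some rank i."""
    m = len(p_values)
    if m == 0:
        raise ValueError("need at least one p-value")
    ordered = sorted(p_values)  # p_(1) <= ... <= p_(m)
    return any(p <= i * alpha / m for i, p in enumerate(ordered, start=1))


if __name__ == "__main__":
    # Hypothetical p-values, one per conditioning variable (illustrative only).
    p = [0.02, 0.03, 0.04]
    print("Bonferroni rejects:", bonferroni_global_test(p))
    print("Simes rejects:", simes_global_test(p))
```

With these illustrative p-values the Bonferroni rule does not reject (0.02 exceeds 0.05/3), while the Simes rule does (the second smallest p-value, 0.03, is below 2 × 0.05/3). Simes rejects whenever Bonferroni does, so it is never less powerful as a global test.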
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Łazȩcka, M., Mielniczuk, J. (2021). Multiple Testing of Conditional Independence Hypotheses Using Information-Theoretic Approach. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2021. Lecture Notes in Computer Science, vol 12898. Springer, Cham. https://doi.org/10.1007/978-3-030-85529-1_7
DOI: https://doi.org/10.1007/978-3-030-85529-1_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85528-4
Online ISBN: 978-3-030-85529-1
eBook Packages: Computer Science, Computer Science (R0)