Abstract
The fundamental concepts underlying Markov networks are conditional independence and the set of rules, called Markov properties, that translate conditional independence constraints into graphs. We introduce the concept of mutual conditional independence in an independent set of a Markov network, and we prove its equivalence to the Markov properties under certain regularity conditions. This extends the analogy between separation in a graph and conditional independence in probability to an analogy between mutual separation in a graph and mutual conditional independence in probability. Model selection in graphical models remains a challenging task due to the large search space. We show that the mutual conditional independence property can be exploited to reduce the search space. We present a new forward model selection algorithm for graphical log-linear models using mutual conditional independence, and we illustrate our algorithm with a real data set example. We show that for sparse models the size of the search space can be reduced from \(\mathcal {O} (n^{3})\) to \(\mathcal {O}(n^{2})\) by using our proposed forward selection method rather than the classical forward selection method. We also envision that this property can be leveraged for model selection and inference in other types of graphical models.
References
Agresti, A.: Categorical Data Analysis, 3rd edn. Wiley-Interscience, New York (2012)
Andersen, A.H.: Multidimensional contingency tables. Scand. J. Statist. 3, 115–127 (1974)
Bishop, Y.M.M., Fienberg, S.E., Holland, P.W.: Discrete Multivariate Analysis: Theory and Practice. The MIT Press, Cambridge (1989)
Christensen, R.: Log-Linear Models and Logistic Regression, 2nd edn. Springer, Berlin (1997)
Dahinden, C., Kalisch, M., Buhlmann, P.: Decomposition and model selection for large contingency tables. Biometrical Journal 52, 233–252 (2010)
Dawid, A.P.: Conditional independence in statistical theory. J. R. Stat. Soc. Ser. B 41(1), 1–31 (1979)
Gauraha, N.: Graphical log-linear models: fundamental concepts and applications. Journal of Modern Applied Statistical Methods 16(1), 545–577 (2017)
Geiger, D., Pearl, J.: Logical and algorithmic properties of conditional independence and graphical models. Ann. Stat. 21(4), 2001–2021 (1993)
Goodman, L.A.: The analysis of multidimensional contingency tables: stepwise procedures and direct estimation methods for building models for multiple classifications. Technometrics 13, 31–66 (1971)
Jordan, M.I.: Graphical models. Stat. Sci. 19(1), 140–155 (2004)
Lauritzen, S.L.: Graphical Models, 2nd edn. Oxford University Press Inc., New York (1996)
Matus, F.: On equivalence of Markov properties over undirected graphs. J. Appl. Probability 29, 745–749 (1992)
Matus, F.: On conditional independence and log-convexity. Annales de l’I.H.P. Probabilités et statistiques 48(4), 1137–1147 (2012)
Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers Inc., San Francisco (1988)
Preston, C.: Random Fields. Springer, Berlin (1974)
Reinis, Z., Pokorny, J., Basika, V., Tiserova, J., Gorican, K., Horakova, D., Stuchlikova, E., Havranek, T., Hrabovsky, F.: Prognostic significance of the risk profile in the prevention of coronary heart disease. Bratis. Lek. Listy 76, 137–150 (1981)
Spitzer, F.: Random fields and interacting particle systems. M.A.A. Summer Seminar Notes (1971)
Webb, G.I., Petitjean, F.: A multiple test correction for streams and cascades of statistical hypothesis tests. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1255–1264. ACM (2016)
Wermuth, N.: Model search among multiplicative models. Biometrics 32, 253–263 (1976)
Whittaker, J.: Graphical Models in Applied Multivariate Statistics, 2nd edn. Wiley, Chichester (1990)
Acknowledgements
The authors are grateful for the constructive inputs given by anonymous reviewers and the editor, which helped us to improve the manuscript.
Funding
Open access funding provided by Royal Institute of Technology.
Appendix: Illustration of the proposed forward model selection algorithm
We begin by fitting the complete independence model. The vertices A, B, C, D, E and F correspond to the factors “smoke”, “mental”, “phys”, “systol”, “protein” and “family”, respectively. Under the assumption of the complete independence model, we obtain the closed-form expression for the estimates of the expected cell counts: \(\hat{m}(i_{A}, i_{B}, \ldots, i_{F}) = n(i_{A})\, n(i_{B}) \cdots n(i_{F}) / n^{5}\), where \(n(i_{v})\) denotes the one-way marginal count of factor \(v\) and \(n\) the total count.
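The closed-form fit can be sketched in a few lines of Python (a hypothetical helper of ours, not code from the paper): each fitted count is the total count multiplied by the one-way marginal proportions of the cell.

```python
from itertools import product

def complete_independence_fit(counts, levels):
    """Fitted cell counts under complete independence:
    m_hat(cell) = n * prod_j (one-way marginal of factor j at cell[j] / n).
    `counts` maps cells (tuples of factor levels) to observed counts;
    `levels` lists the possible levels of each factor."""
    n = sum(counts.values())
    marginals = [{} for _ in levels]  # one-way marginal totals, one dict per factor
    for cell, c in counts.items():
        for j, lev in enumerate(cell):
            marginals[j][lev] = marginals[j].get(lev, 0) + c
    fitted = {}
    for cell in product(*levels):
        m = n
        for j, lev in enumerate(cell):
            m *= marginals[j].get(lev, 0) / n
        fitted[cell] = m
    return fitted
```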
For details on the derivation of the above expression, we refer to the book [3]. After computing the estimates of the expected cell counts, the G2 statistic for this model is computed using (12), giving G2 = 843.957 (df : 57, p-value : 0). Since the observed value of the test statistic is unexpectedly large, we conclude that the data fail to support the complete independence model. The data structures are initialized as follows.
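The G2 statistic of (12) is the likelihood-ratio deviance between observed and fitted counts; a minimal sketch (our own helper, assuming the fitted counts are strictly positive):

```python
import math

def g2_statistic(observed, fitted):
    """Deviance statistic G^2 = 2 * sum over cells of obs * ln(obs / fitted).
    Cells with zero observed count contribute nothing to the sum."""
    return 2.0 * sum(obs * math.log(obs / fitted[cell])
                     for cell, obs in observed.items() if obs > 0)
```

The statistic is compared against a chi-square distribution whose degrees of freedom equal the number of free cells minus the number of fitted parameters.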
As mentioned before, at each step we add the most significant edge, as long as its significance level is below a cut-off value (we use a cut-off of α = 0.05). As a first step, we compare all the models obtained by adding a single edge to the complete independence model. We use the G2 statistic (12) for comparing models. Table 4 gives the model fits and Table 5 summarizes the test results.
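One forward step can be sketched as follows, assuming a caller-supplied `deviance(edges)` that fits the log-linear model for a given edge set and returns its G2 and degrees of freedom, and a `crit` standing in for the chi-square cut-off at α = 0.05 for the df difference (all names here are ours):

```python
def forward_step(current_edges, candidates, deviance, crit):
    """One forward-selection step: for each candidate edge compute the drop in
    deviance dG2 = G2(current) - G2(current + edge), then add the edge with the
    largest drop, provided the drop exceeds the critical value `crit`."""
    g2_cur, _ = deviance(current_edges)
    best_edge, best_drop = None, 0.0
    for e in candidates:
        g2_new, _ = deviance(current_edges | {e})
        drop = g2_cur - g2_new
        if drop > best_drop:
            best_edge, best_drop = e, drop
    if best_edge is not None and best_drop > crit:
        return current_edges | {best_edge}, best_edge
    return current_edges, None  # no significant edge: selection stops
```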
The model with edge (B, C) has the largest difference in G2 (equivalently, the smallest p-value), so we choose this model as the current model. The set containing the factors B and C is separated as follows.
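The splitting rule can be written down directly (a sketch; the function name is ours): adding the edge (u, v) replaces the MCI set containing both endpoints by the two subsets obtained by removing v and by removing u.

```python
def split_on_edge(mci_set, edge):
    """Adding edge (u, v) splits the MCI set containing u and v into
    the subset without v and the subset without u."""
    u, v = edge
    return mci_set - {v}, mci_set - {u}
```

For example, adding (B, C) to the full set {A, …, F} yields {A, B, D, E, F} and {A, C, D, E, F}.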
Before moving to the next step, we make sure that the list tempAMIS is irreducible and contains no duplicates.
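The irreducibility and duplicate checks amount to removing repeated sets and any set strictly contained in another member of the list; a small sketch (our own helper):

```python
def reduce_amis(sets_):
    """Remove duplicates and any set strictly contained in another set,
    mirroring the irreducibility check on tempAMIS."""
    uniq = []
    for s in sets_:
        fs = frozenset(s)
        if fs not in uniq:          # drop duplicates
            uniq.append(fs)
    # drop sets that are proper subsets of another member
    return [s for s in uniq if not any(s < t for t in uniq)]
```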
As a next step, we first check whether MCIP holds for the members of tempAMIS. The MCI tests for the elements {A, B, D, E, F} and {A, C, D, E, F} of tempAMIS give G2 statistics of 113.566 (df : 52, p-value : 0) and 125.16 (df : 52, p-value : 0), respectively. Since the observed values of the test statistics are unexpectedly large, MCI does not hold for {A, B, D, E, F} or {A, C, D, E, F}.
We consider the set {A, B, D, E, F}. We compare the current model with all the models having an additional edge from the set {A, B, D, E, F}. The results are given in Tables 6 and 7.
The term (B, E) is added to the current model. The data structures are updated as follows.
There is no duplicate, but the irreducibility check shows that {A, D, E, F} ⊂ {A, C, D, E, F}; the set {A, D, E, F} is therefore redundant and is removed from the list tempAMIS. Accordingly, the data structures get updated as follows.
As a next step, we check whether MCIP holds for the set {A, B, D, F}. The MCI test for this set gives the G2 statistic 67.5 (df : 44, p-value : 0.013). Since the observed value of the test statistic is unexpectedly large, the data do not support the assumption A⊥⊥B⊥⊥D⊥⊥F|(C, E). Note that we already verified in the previous step that MCIP does not hold for {A, C, D, E, F}.
Now, we consider adding an edge from the set {A, C, D, E, F}. Table 8 gives the model fits and Table 9 summarizes the test results.
The term (A, C) is added to the current model. Accordingly, the modified data structures are given below.
There is no duplicate and tempAMIS cannot be reduced further, so we move on to the next step. The G2 statistics of the MCI tests for the sets {A, B, D, F}, {A, D, E, F} and {C, D, E, F} are 67.500 (df : 44, p-value : 0.0129), 66.025 (df : 44, p-value : 0.0174) and 93.066 (df : 44, p-value : 0), respectively, which indicates that the data do not support the MCI relations for these sets. We consider adding an additional edge from the set {A, B, D, F} to the current model; the details are given in Tables 10 and 11.
The model with edge (A, D) has the largest difference in G2, so we add the edge (A, D) to the current model. The data structures get updated as follows (note that since {D, E, F} ⊂ {C, D, E, F}, the set {D, E, F} is redundant and is removed from tempAMIS).
We perform the MCI tests for the sets {A, B, F}, {B, D, F} and {A, E, F}. The G2 statistics, 38.915 (df : 32, p-value : 0.1865), 39.271 (df : 32, p-value : 0.1762) and 59.043 (df : 32, p-value : 0.0025), indicate that MCI holds for the sets {A, B, F} and {B, D, F}, while the data fail to support MCI for the set {A, E, F}. The sets {A, B, F} and {B, D, F} are moved from tempAMIS to AMIS.
Now, we look for the most significant edge in the set {C, D, E, F}. Test details are given in Tables 12 and 13.
We select the model with an additional edge (D, E). The set {C, D, E, F} gets divided into the subsets {C, D, F} and {C, E, F}.
We perform the MCI tests for {C, D, F} and {C, E, F}; the G2 statistics are 35.475 (df : 32, p-value : 0.3077) and 42.534 (df : 32, p-value : 0.1009), so we conclude that the data support the MCI relations for both sets. The sets are removed from tempAMIS and added to AMIS. We get the intermediate data structures as follows.
We consider the set {A, E, F}. There is only one candidate edge, (A, E), from this set to be considered, since {A, F} ⊂ {A, B, F} and {E, F} ⊂ {C, E, F}. From the model comparison test given in Tables 14 and 15, we find that the test statistic is unexpectedly large; hence the data favour the larger model with the additional edge [AE].
Finally, the data structures get updated as follows.
As the list tempAMIS becomes empty, we stop with the model [AC][ADE][BC][BE][F]. The algorithm returns the AMIS {{A, B, F}, {B, D, F}, {C, D, F}, {C, E, F}}. A graph structure can be determined uniquely from the AMIS, as shown in Fig. 3.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Gauraha, N., Parui, S.K. Mutual conditional independence and its applications to model selection in Markov networks. Ann Math Artif Intell 88, 951–972 (2020). https://doi.org/10.1007/s10472-020-09690-7
Keywords
- Markov networks
- Mutual conditional independence
- Graphical models
- Graphical log-linear models
- Forward model selection