Abstract
The fundamental concepts underlying Markov networks are conditional independence and the set of rules, called Markov properties, that translate conditional independence constraints into graphs. We introduce the concept of mutual conditional independence in an independent set of a Markov network, and we prove its equivalence to the Markov properties under certain regularity conditions. This extends the analogy between separation in a graph and conditional independence in probability to an analogy between mutual separation in a graph and mutual conditional independence in probability. Model selection in graphical models remains a challenging task due to the large search space. We show that the mutual conditional independence property can be exploited to reduce the search space. We present a new forward model selection algorithm for graphical log-linear models using mutual conditional independence, and we illustrate our algorithm with a real data set example. We show that for sparse models the size of the search space can be reduced from \(\mathcal {O} (n^{3})\) to \(\mathcal {O}(n^{2})\) by using our proposed forward selection method rather than the classical forward selection method. We also envision that this property can be leveraged for model selection and inference in other types of graphical models.
References
Agresti, A.: Categorical Data Analysis, 3rd edn. Wiley-Interscience, New York (2012)
Andersen, A.H.: Multidimensional contingency tables. Scand. J. Statist. 3, 115–127 (1974)
Bishop, Y.M.M., Fienberg, S.E., Holland, P.W.: Discrete Multivariate Analysis: Theory and Practice. The MIT Press, Cambridge (1989)
Christensen, R.: Log-Linear Models and Logistic Regression, 2nd edn. Springer, Berlin (1997)
Dahinden, C., Kalisch, M., Buhlmann, P.: Decomposition and model selection for large contingency tables. Biometrical Journal 52, 233–252 (2010)
Dawid, A.P.: Conditional independence in statistical theory. J. R. Stat. Soc. Ser. B 41(1), 1–31 (1979)
Gauraha, N.: Graphical log-linear models: fundamental concepts and applications. Journal of Modern Applied Statistical Methods 16(1), 545–577 (2017)
Geiger, D., Pearl, J.: Logical and algorithmic properties of conditional independence and graphical models. Ann. Stat. 21(4), 2001–2021 (1993)
Goodman, L.A.: The analysis of multidimensional contingency tables: stepwise procedures and direct estimation methods for building models for multiple classifications. Technometrics 13, 31–66 (1971)
Jordan, M.I.: Graphical models. Stat. Sci. 19(1), 140–155 (2004)
Lauritzen, S.L.: Graphical Models, 2nd edn. Oxford University Press Inc., New York (1996)
Matus, F.: On equivalence of Markov properties over undirected graphs. J. Appl. Probability 29, 745–749 (1992)
Matus, F.: On conditional independence and log-convexity. Annales de l’I.H.P. Probabilités et statistiques 48(4), 1137–1147 (2012)
Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers Inc., San Francisco (1988)
Preston, C.: Random Fields. Springer, Berlin (1974)
Reinis, Z., Pokorny, J., Basika, V., Tiserova, J., Gorican, K., Horakova, D., Stuchlikova, E., Havranek, T., Hrabovsky, F.: Prognostic significance of the risk profile in the prevention of coronary heart disease. Bratis. Lek. Listy 76, 137–150 (1981)
Spitzer, F.: Random fields and interacting particle systems. M.A.A. Summer Seminar Notes (1971)
Webb, G.I., Petitjean, F.: A multiple test correction for streams and cascades of statistical hypothesis tests. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1255–1264. ACM (2016)
Wermuth, N.: Model search among multiplicative models. Biometrics 32, 253–263 (1976)
Whittaker, J.: Graphical Models in Applied Multivariate Statistics, 2nd edn. Wiley, Chichester (1990)
Acknowledgements
The authors are grateful for the constructive inputs given by anonymous reviewers and the editor, which helped us to improve the manuscript.
Funding
Open access funding provided by Royal Institute of Technology.
Appendix: Illustration of the proposed forward model selection algorithm
We begin by fitting the complete independence model. The vertices A, B, C, D, E and F correspond to the factors “smoke”, “mental”, “phys”, “systol”, “protein” and “family”, respectively. Under the assumption of the complete independence model, we obtain the closed-form expression for the estimates of the expected cell counts: \(\hat{m}(i_{A}, i_{B}, \ldots, i_{F}) = n(i_{A})\, n(i_{B}) \cdots n(i_{F}) / n^{5}\), where \(n(i_{v})\) denotes the one-way marginal count of factor \(v\) and \(n\) the total count.
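The closed-form fit can be sketched in a few lines of Python (a hypothetical helper of ours, not code from the paper): each fitted count is the total count multiplied by the one-way marginal proportions of the cell.

```python
from itertools import product

def complete_independence_fit(counts, levels):
    """Fitted cell counts under complete independence:
    m_hat(cell) = n * prod_j (one-way marginal of factor j at cell[j] / n).
    `counts` maps cells (tuples of factor levels) to observed counts;
    `levels` lists the possible levels of each factor."""
    n = sum(counts.values())
    marginals = [{} for _ in levels]  # one-way marginal totals, one dict per factor
    for cell, c in counts.items():
        for j, lev in enumerate(cell):
            marginals[j][lev] = marginals[j].get(lev, 0) + c
    fitted = {}
    for cell in product(*levels):
        m = n
        for j, lev in enumerate(cell):
            m *= marginals[j].get(lev, 0) / n
        fitted[cell] = m
    return fitted
```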
For details on the derivation of the above expression, we refer to the book [3]. After computing the estimates of the expected cell counts, the G2 statistic for this model is computed using (12), giving G2 = 843.957 (df : 57, p-value : 0). Since the observed value of the test statistic is unexpectedly large, we conclude that the data fail to support the complete independence model. The data structures are initialized as follows.
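The G2 statistic of (12) is the likelihood-ratio deviance between observed and fitted counts; a minimal sketch (our own helper, assuming the fitted counts are strictly positive):

```python
import math

def g2_statistic(observed, fitted):
    """Deviance statistic G^2 = 2 * sum over cells of obs * ln(obs / fitted).
    Cells with zero observed count contribute nothing to the sum."""
    return 2.0 * sum(obs * math.log(obs / fitted[cell])
                     for cell, obs in observed.items() if obs > 0)
```

The statistic is compared against a chi-square distribution whose degrees of freedom equal the number of free cells minus the number of fitted parameters.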
As mentioned before, at each step we add the most significant edge, as long as its significance level is below a cut-off value (we use a cut-off of α = 0.05). As a first step, we compare all the models obtained by adding a single edge to the complete independence model. We use the G2 statistic (12) for comparing models. Table 4 gives the model fits and Table 5 summarizes the test results.
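One forward step can be sketched as follows, assuming a caller-supplied `deviance(edges)` that fits the log-linear model for a given edge set and returns its G2 and degrees of freedom, and a `crit` standing in for the chi-square cut-off at α = 0.05 for the df difference (all names here are ours):

```python
def forward_step(current_edges, candidates, deviance, crit):
    """One forward-selection step: for each candidate edge compute the drop in
    deviance dG2 = G2(current) - G2(current + edge), then add the edge with the
    largest drop, provided the drop exceeds the critical value `crit`."""
    g2_cur, _ = deviance(current_edges)
    best_edge, best_drop = None, 0.0
    for e in candidates:
        g2_new, _ = deviance(current_edges | {e})
        drop = g2_cur - g2_new
        if drop > best_drop:
            best_edge, best_drop = e, drop
    if best_edge is not None and best_drop > crit:
        return current_edges | {best_edge}, best_edge
    return current_edges, None  # no significant edge: selection stops
```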
The model with edge (B, C) has the largest difference in G2 (equivalently, the smallest p-value), so we choose this model as the current model. The set containing the factors B and C is separated as follows.
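The splitting rule can be written down directly (a sketch; the function name is ours): adding the edge (u, v) replaces the MCI set containing both endpoints by the two subsets obtained by removing v and by removing u.

```python
def split_on_edge(mci_set, edge):
    """Adding edge (u, v) splits the MCI set containing u and v into
    the subset without v and the subset without u."""
    u, v = edge
    return mci_set - {v}, mci_set - {u}
```

For example, adding (B, C) to the full set {A, …, F} yields {A, B, D, E, F} and {A, C, D, E, F}.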
Before moving to the next step, we make sure that the list tempAMIS is irreducible and contains no duplicates.
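The irreducibility and duplicate checks amount to removing repeated sets and any set strictly contained in another member of the list; a small sketch (our own helper):

```python
def reduce_amis(sets_):
    """Remove duplicates and any set strictly contained in another set,
    mirroring the irreducibility check on tempAMIS."""
    uniq = []
    for s in sets_:
        fs = frozenset(s)
        if fs not in uniq:          # drop duplicates
            uniq.append(fs)
    # drop sets that are proper subsets of another member
    return [s for s in uniq if not any(s < t for t in uniq)]
```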
As a next step, we first check whether MCIP holds for the members of tempAMIS. The MCI tests for the elements {A, B, D, E, F} and {A, C, D, E, F} of tempAMIS give G2 statistics of 113.566 (df : 52, p-value : 0) and 125.16 (df : 52, p-value : 0), respectively. Since the observed values of the test statistics are unexpectedly large, MCI does not hold for {A, B, D, E, F} or {A, C, D, E, F}.
We consider the set {A, B, D, E, F}. We compare the current model with all the models having an additional edge from the set {A, B, D, E, F}. The results are given in Tables 6 and 7.
The term (B, E) is added to the current model. The data structures are updated as follows.
There is no duplicate, but the irreducibility check shows that {A, D, E, F} ⊂ {A, C, D, E, F}; the set {A, D, E, F} is therefore redundant and is removed from the list tempAMIS. Accordingly, the data structures get updated as follows.
As a next step, we check whether MCIP holds for the set {A, B, D, F}. The MCI test for this set gives the G2 statistic 67.5 (df : 44, p-value : 0.013). Since the observed value of the test statistic is unexpectedly large, the data do not support the assumption A⊥⊥B⊥⊥D⊥⊥F|(C, E). Note that we already verified in the previous step that MCIP does not hold for {A, C, D, E, F}.
Now, we consider adding an edge from the set {A, C, D, E, F}. Table 8 gives the model fits and Table 9 summarizes the test results.
The term (A, C) is added to the current model. Accordingly, the modified data structures are given below.
There is no duplicate and tempAMIS cannot be reduced further, so we move on to the next step. The G2 statistics of the MCI tests for the sets {A, B, D, F}, {A, D, E, F} and {C, D, E, F} are 67.500 (df : 44, p-value : 0.0129), 66.025 (df : 44, p-value : 0.0174) and 93.066 (df : 44, p-value : 0), respectively, which indicates that the data do not support the MCI relations for these sets. We consider adding an additional edge from the set {A, B, D, F} to the current model; the details are given in Tables 10 and 11.
The model with edge (A, D) has the largest difference in G2, so we add the edge (A, D) to the current model. The data structures get updated as follows (note that since {D, E, F} ⊂ {C, D, E, F}, the set {D, E, F} is redundant and is removed from tempAMIS).
We perform the MCI tests for the sets {A, B, F}, {B, D, F} and {A, E, F}. The G2 statistics, 38.915 (df : 32, p-value : 0.1865), 39.271 (df : 32, p-value : 0.1762) and 59.043 (df : 32, p-value : 0.0025), indicate that MCI holds for the sets {A, B, F} and {B, D, F}, while the data fail to support MCI for the set {A, E, F}. The sets {A, B, F} and {B, D, F} are moved from tempAMIS to AMIS.
Now, we look for the most significant edge in the set {C, D, E, F}. Test details are given in Tables 12 and 13.
We select the model with an additional edge (D, E). The set {C, D, E, F} gets divided into the subsets {C, D, F} and {C, E, F}.
We perform the MCI tests for {C, D, F} and {C, E, F}; the G2 statistics are 35.475 (df : 32, p-value : 0.3077) and 42.534 (df : 32, p-value : 0.1009), so we conclude that the data support the MCI relations for both sets. The sets are removed from tempAMIS and added to AMIS. We get the intermediate data structures as follows.
We consider the set {A, E, F}. There is only one candidate edge, (A, E), from this set to be considered, since {A, F} ⊂ {A, B, F} and {E, F} ⊂ {C, E, F}. From the model comparison test given in Tables 14 and 15, we find that the test statistic is unexpectedly large; hence the data favour the larger model with the additional edge [AE].
Finally, the data structures get updated as follows.
As the list tempAMIS becomes empty, we stop with the model [AC][ADE][BC][BE][F]. The algorithm returns the AMIS {{A, B, F}, {B, D, F}, {C, D, F}, {C, E, F}}. A graph structure can be determined uniquely from the AMIS, as shown in Fig. 3.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Gauraha, N., Parui, S.K. Mutual conditional independence and its applications to model selection in Markov networks. Ann Math Artif Intell 88, 951–972 (2020). https://doi.org/10.1007/s10472-020-09690-7
Keywords
- Markov networks
- Mutual conditional independence
- Graphical models
- Graphical log-linear models
- Forward model selection