Abstract
Learning the structure of Markov networks from data has become increasingly important in machine learning and in many other application fields. Markov networks are probabilistic graphical models, a widely used formalism for handling probability distributions in intelligent systems. This document focuses on a technology called independence-based learning, which learns the independence structure of Markov networks from data in an efficient and sound manner, provided that the dataset is sufficiently large and is a representative sample of the target distribution. In analyzing this technology, this work surveys the current state-of-the-art algorithms, discusses their limitations, and poses a series of open problems where future work may advance the area in terms of quality and efficiency.
Cite this article
Schlüter, F. A survey on independence-based Markov networks learning. Artif Intell Rev 42, 1069–1093 (2014). https://doi.org/10.1007/s10462-012-9346-y
DOI: https://doi.org/10.1007/s10462-012-9346-y