A survey on independence-based Markov networks learning

Artificial Intelligence Review

Abstract

The problem of learning the structure of a Markov network from data has become increasingly important in machine learning and in many other application fields. Markov networks are probabilistic graphical models, a widely used formalism for handling probability distributions in intelligent systems. This document focuses on a technology called independence-based learning, which allows the independence structure of Markov networks to be learned from data in an efficient and sound manner, provided that the dataset is sufficiently large and the data are a representative sample of the target distribution. In analyzing this technology, this work surveys the current state-of-the-art algorithms, discusses their limitations, and poses a series of open problems where future work may produce advances in the area, in terms of quality and efficiency.

Author information

Correspondence to Federico Schlüter.

About this article

Cite this article

Schlüter, F. A survey on independence-based Markov networks learning. Artif Intell Rev 42, 1069–1093 (2014). https://doi.org/10.1007/s10462-012-9346-y

  • DOI: https://doi.org/10.1007/s10462-012-9346-y
