A survey on independence-based Markov networks learning

Schlüter, Federico

doi:10.1007/s10462-012-9346-y

Published: 21 June 2012

Artificial Intelligence Review, volume 42, pages 1069–1093 (2014)

Abstract

Learning the structure of Markov networks from data has become increasingly important in machine learning and in many other application fields. Markov networks are probabilistic graphical models, a widely used formalism for handling probability distributions in intelligent systems. This document focuses on a technology called independence-based learning, which allows the independence structure of Markov networks to be learned from data in an efficient and sound manner, provided the dataset is sufficiently large and the data are a representative sample of the target distribution. In analyzing this technology, this work surveys the current state-of-the-art algorithms, discusses their limitations, and poses a series of open problems where future work may produce advances in the area in terms of quality and efficiency.

Author information

Authors and Affiliations

Lab. DHARMa of Artificial Intelligence, Dept of Information Systems, Facultad Regional Mendoza, National Technological University, Mendoza, Argentina
Federico Schlüter

Corresponding author

Correspondence to Federico Schlüter.

About this article

Cite this article

Schlüter, F. A survey on independence-based Markov networks learning. Artif Intell Rev 42, 1069–1093 (2014). https://doi.org/10.1007/s10462-012-9346-y

Issue Date: December 2014
DOI: https://doi.org/10.1007/s10462-012-9346-y
