Abstract
This chapter addresses the problem of estimating the parameters of a Bayesian network from incomplete data. This is a hard problem, which for computational reasons cannot be effectively tackled by a fully Bayesian approach. The workaround is to search for the estimate with maximum posterior probability. This is usually done by running Expectation-Maximization from multiple distinct starting points and selecting the estimate with the highest posterior probability. However, the posterior probability function has many local maxima, several of which have similarly high probability. We argue that high probability is necessary but not sufficient for obtaining good estimates. We present an approach based on maximum entropy to address this problem and describe a simple and effective way to implement it. Experiments show that our approach produces significantly better estimates than the most commonly used method.
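To make the selection step concrete, the sketch below is a minimal illustration, not the authors' implementation: the tiny X → Y network, the MCAR masking rate, the Dirichlet smoothing, and the tie tolerance `tol` are all illustrative assumptions. It runs EM from several random starting points on data whose values are missing completely at random and then, among runs whose score is within a small margin of the best, prefers the estimate whose parameters have maximum entropy, rather than simply keeping the single highest-scoring run.

```python
# Sketch (assumptions flagged above): multi-restart EM on a tiny Bayesian
# network X -> Y with X missing completely at random (MCAR), followed by a
# maximum-entropy tie-break among near-equally probable estimates.
import numpy as np

rng = np.random.default_rng(0)

# --- synthetic data: binary X -> binary Y, each X hidden with prob. 0.5 ---
n = 500
true_px = 0.7                       # P(X = 1)
true_py_x = np.array([0.2, 0.9])    # P(Y = 1 | X = 0), P(Y = 1 | X = 1)
x = rng.binomial(1, true_px, n)
y = rng.binomial(1, true_py_x[x])
x_obs = np.where(rng.random(n) < 0.5, x, -1)   # -1 marks a missing X value

def em_run(x_obs, y, iters=200):
    """One EM run from a random start; returns (px, py_x, log-likelihood)."""
    px = rng.uniform(0.05, 0.95)
    py_x = rng.uniform(0.05, 0.95, size=2)
    for _ in range(iters):
        # E-step: posterior P(X = 1 | Y) for records with X missing
        p1 = px * np.where(y == 1, py_x[1], 1 - py_x[1])
        p0 = (1 - px) * np.where(y == 1, py_x[0], 1 - py_x[0])
        resp = np.where(x_obs == -1, p1 / (p1 + p0), x_obs)
        # M-step: expected counts (a small Dirichlet prior keeps params off 0/1)
        px = (resp.sum() + 1) / (n + 2)
        for k in (0, 1):
            w = resp if k == 1 else 1 - resp
            py_x[k] = (w @ y + 1) / (w.sum() + 2)
    # observed-data log-likelihood of the final parameters
    p1 = px * np.where(y == 1, py_x[1], 1 - py_x[1])
    p0 = (1 - px) * np.where(y == 1, py_x[0], 1 - py_x[0])
    ll = np.where(x_obs == -1,
                  np.log(p1 + p0),
                  np.where(x_obs == 1, np.log(p1), np.log(p0))).sum()
    return px, py_x, ll

def params_entropy(px, py_x):
    """Sum of the entropies of the local conditional distributions."""
    def h(p):
        return -(p * np.log(p) + (1 - p) * np.log(1 - p))
    return h(px) + h(py_x[0]) + h(py_x[1])

runs = [em_run(x_obs, y) for _ in range(20)]
best_ll = max(r[2] for r in runs)
tol = 1e-3 * abs(best_ll)           # runs within this margin count as "ties"
candidates = [r for r in runs if best_ll - r[2] <= tol]
# common practice: keep the single highest-scoring run;
# the idea illustrated here: among near-ties, prefer the maximum-entropy one
px, py_x, ll = max(candidates, key=lambda r: params_entropy(r[0], r[1]))
print(f"chosen estimate: P(X=1)={px:.3f}, P(Y=1|X)={py_x.round(3)}, loglik={ll:.2f}")
```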
Notes
- 1.
MCAR (missing completely at random) indicates that the probability of each value being missing depends neither on the value itself nor on the values of the other variables.
Acknowledgements
The research in this paper has been partially supported by the Swiss NSF grant no. 200021_146606/1.