Bayesian Networks with Imprecise Probabilities: Theory and Application to Classification

  • G. Corani
  • A. Antonucci
  • M. Zaffalon
Part of the Intelligent Systems Reference Library book series (ISRL, volume 23)

Abstract

Bayesian networks are powerful probabilistic graphical models for reasoning under uncertainty. Classification is one of their most important applications: several widely used classifiers are based on Bayesian networks. Bayesian networks are precise models: their quantification requires exact numerical values for all probabilities, a requirement that is sometimes too restrictive. In such cases, sets of distributions, rather than single distributions, provide a more realistic description. Bayesian networks can be generalized to cope with sets of distributions, leading to a class of imprecise probabilistic graphical models called credal networks. In particular, classifiers based on Bayesian networks generalize to so-called credal classifiers. Unlike Bayesian classifiers, which always return the single class maximizing the posterior probability, a credal classifier may be unable to single out one class. In other words, when the available information is insufficient, credal classifiers allow for indecision between two or more classes, thus providing a less informative but more robust conclusion than Bayesian classifiers.
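
To make the notion of indecision concrete, below is a minimal, self-contained sketch (not taken from the chapter) of credal classification by interval dominance: each class receives an interval of posterior probabilities, and the classifier returns every class whose interval is not dominated by another class's interval. The function name and the numeric intervals are purely illustrative assumptions.

```python
# Minimal sketch (illustrative only): credal classification by interval
# dominance over posterior class-probability intervals.

def credal_predict(intervals):
    """Return the classes that are not interval-dominated.

    `intervals` maps each class to a (lower, upper) bound on its posterior
    probability, e.g. as induced by a credal set of distributions.
    """
    undominated = []
    for c, (_, up_c) in intervals.items():
        # Class c is dominated if some other class has a lower bound
        # that exceeds c's upper bound.
        dominated = any(low_k > up_c
                        for k, (low_k, _) in intervals.items() if k != c)
        if not dominated:
            undominated.append(c)
    return undominated

# Precise (point) probabilities: a single class is returned,
# as with a Bayesian classifier.
print(credal_predict({"a": (0.55, 0.55), "b": (0.30, 0.30), "c": (0.15, 0.15)}))
# -> ['a']

# Imprecise (wide) intervals: the classifier suspends judgement
# and returns a set of candidate classes.
print(credal_predict({"a": (0.35, 0.60), "b": (0.30, 0.55), "c": (0.05, 0.20)}))
# -> ['a', 'b']
```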

Keywords

Credal sets, credal networks, Bayesian networks, classification, credal classifiers, naive Bayes classifier, naive credal classifier, tree-augmented naive Bayes classifier, tree-augmented naive credal classifier

References

  1. Abellán, J., Moral, S.: Upper entropy of credal sets. Applications to credal classification. International Journal of Approximate Reasoning 39(2–3), 235–255 (2005)
  2. Antonucci, A., Brühlmann, R., Piatti, A., Zaffalon, M.: Credal networks for military identification problems. International Journal of Approximate Reasoning 50(4), 666–679 (2009)
  3. Antonucci, A., Cuzzolin, F.: Credal sets approximation by lower probabilities: application to credal networks. In: Hüllermeier, E., Kruse, R., Hoffmann, F. (eds.) IPMU 2010. LNCS, vol. 6178, pp. 716–725. Springer, Heidelberg (2010)
  4. Antonucci, A., Piatti, A., Zaffalon, M.: Credal networks for operational risk measurement and management. In: Apolloni, B., Howlett, R.J., Jain, L. (eds.) KES 2007, Part II. LNCS (LNAI), vol. 4693, pp. 604–611. Springer, Heidelberg (2007)
  5. Antonucci, A., Salvetti, A., Zaffalon, M.: Credal networks for hazard assessment of debris flows. In: Kropp, J., Scheffran, J. (eds.) Advanced Methods for Decision Making and Risk Management in Sustainability Science. Nova Science Publishers, New York (2007)
  6. Antonucci, A., Sun, Y., de Campos, C., Zaffalon, M.: Generalized loopy 2U: a new algorithm for approximate inference in credal networks. International Journal of Approximate Reasoning 51(5), 474–484 (2010)
  7. Antonucci, A., Zaffalon, M.: Equivalence between Bayesian and credal nets on an updating problem. In: Lawry, J., Miranda, E., Bugarin, A., Li, S., Gil, M.A., Grzegorzewski, P., Hryniewicz, O. (eds.) Proceedings of the Third International Conference on Soft Methods in Probability and Statistics (SMPS 2006), pp. 223–230. Springer, Heidelberg (2006)
  8. Antonucci, A., Zaffalon, M.: Decision-theoretic specification of credal networks: a unified language for uncertain modeling with sets of Bayesian networks. International Journal of Approximate Reasoning 49(2), 345–361 (2008)
  9. Avis, D., Fukuda, K.: Reverse search for enumeration. Discrete Applied Mathematics 65, 21–46 (1996)
  10. Benavoli, A., de Campos, C.P.: Inference from multinomial data based on a MLE-dominance criterion. In: Proceedings of the European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU 2009), Verona, pp. 22–33 (2009)
  11. Benavoli, A., Zaffalon, M., Miranda, E.: Reliable hidden Markov model filtering through coherent lower previsions. In: Proceedings of the 12th International Conference on Information Fusion, Seattle, pp. 1743–1750 (2009)
  12. Bontempi, G., Birattari, M., Bersini, H.: Lazy learning for local modelling and control design. International Journal of Control 72(7), 643–658 (1999)
  13. de Campos, C.P., Cozman, F.G.: Inference in credal networks through integer programming. In: Proceedings of the Fifth International Symposium on Imprecise Probability: Theories and Applications. Action M Agency, Prague (2007)
  14. Campos, L., Huete, J., Moral, S.: Probability intervals: a tool for uncertain reasoning. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 2(2), 167–196 (1994)
  15. Cano, A., Cano, J., Moral, S.: Convex sets of probabilities propagation by simulated annealing on a tree of cliques. In: Bouchon-Meunier, B., Yager, R.R., Zadeh, L.A. (eds.) IPMU 1994. LNCS, vol. 945, pp. 4–8. Springer, Heidelberg (1995)
  16. Cano, A., Gómez-Olmedo, M., Moral, S.: Credal nets with probabilities estimated with an extreme imprecise Dirichlet model. In: de Cooman, G., Vejnarová, I., Zaffalon, M. (eds.) Proceedings of the Fifth International Symposium on Imprecise Probability: Theories and Applications (ISIPTA 2007), pp. 57–66. Action M Agency, Prague (2007)
  17. Cano, A., Moral, S.: A review of propagation algorithms for imprecise probabilities. In: [38], pp. 51–60 (1999)
  18. Cano, A., Moral, S.: Using probability trees to compute marginals with imprecise probabilities. International Journal of Approximate Reasoning 29(1), 1–46 (2002)
  19. Clyde, M., George, E.: Model uncertainty. Statistical Science, 81–94 (2004)
  20. Coolen, F.P.A., Augustin, T.: Learning from multinomial data: a nonparametric predictive alternative to the imprecise Dirichlet model. In: Proceedings of the Fourth International Symposium on Imprecise Probabilities and Their Applications (ISIPTA 2005), pp. 125–134 (2005)
  21. de Cooman, G., Hermans, F., Antonucci, A., Zaffalon, M.: Epistemic irrelevance in credal networks: the case of imprecise Markov trees. International Journal of Approximate Reasoning (accepted for publication)
  22. de Cooman, G., Miranda, E., Zaffalon, M.: Independent natural extension. In: Hüllermeier, E., Kruse, R., Hoffmann, F. (eds.) IPMU 2010. LNCS, vol. 6178, pp. 737–746. Springer, Heidelberg (2010)
  23. Cooper, G.F.: The computational complexity of probabilistic inference using Bayesian belief networks. Artificial Intelligence 42, 393–405 (1990)
  24. Corani, G., Benavoli, A.: Restricting the IDM for classification. In: Hüllermeier, E., Kruse, R., Hoffmann, F. (eds.) IPMU 2010. CCIS, vol. 80, pp. 328–337. Springer, Heidelberg (2010)
  25. Corani, G., de Campos, C.P.: A tree-augmented classifier based on the extreme imprecise Dirichlet model. International Journal of Approximate Reasoning (accepted for publication)
  26. Corani, G., Giusti, A., Migliore, D.: Robust texture recognition using imprecise classification. Under review
  27. Corani, G., Zaffalon, M.: Credal model averaging: an extension of Bayesian model averaging to imprecise probabilities. In: Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD 2008), pp. 257–271. Springer, Heidelberg (2008)
  28. Corani, G., Zaffalon, M.: JNCC2: the Java implementation of naive credal classifier 2. Journal of Machine Learning Research 9, 2695–2698 (2008)
  29. Corani, G., Zaffalon, M.: Learning reliable classifiers from small or incomplete data sets: the naive credal classifier 2. Journal of Machine Learning Research 9, 581–621 (2008)
  30. Corani, G., Zaffalon, M.: Lazy naive credal classifier. In: Proceedings of the 1st ACM SIGKDD Workshop on Knowledge Discovery from Uncertain Data, pp. 30–37. ACM, New York (2009)
  31. Costa, J.E., Fleisher, P.J. (eds.): Physical geomorphology of debris flows, ch. 9, pp. 268–317. Springer, Berlin (1984)
  32. Couso, I., Moral, S., Walley, P.: Examples of independence for imprecise probabilities. In: ISIPTA 1999, pp. 121–130 (1999)
  33. del Coz, J., Díez, J., Bahamonde, A.: Learning nondeterministic classifiers. Journal of Machine Learning Research 10, 2273–2293 (2009)
  34. Cozman, F.G.: Robustness analysis of Bayesian networks with finitely generated convex sets of distributions. Tech. Rep. CMU-RI-TR 96-41, Robotics Institute, Carnegie Mellon University (1996)
  35. Cozman, F.G.: Credal networks. Artificial Intelligence 120, 199–233 (2000)
  36. Dash, D., Cooper, G.: Model averaging for prediction with discrete Bayesian networks. Journal of Machine Learning Research 5, 1177–1203 (2004)
  37. de Campos, C.P., Cozman, F.G.: The inferential complexity of Bayesian and credal networks. In: Proceedings of the International Joint Conference on Artificial Intelligence, Edinburgh, pp. 1313–1318 (2005)
  38. de Cooman, G., Cozman, F.G., Moral, S., Walley, P.: ISIPTA 1999: Proceedings of the First International Symposium on Imprecise Probabilities and Their Applications. The Imprecise Probability Project, Universiteit Gent, Belgium (1999)
  39. de Cooman, G., Zaffalon, M.: Updating beliefs with incomplete observations. Artificial Intelligence 159, 75–125 (2004)
  40. Domingos, P., Pazzani, M.: On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning 29(2–3), 103–130 (1997)
  41. Elkan, C.: Magical thinking in data mining: lessons from CoIL Challenge 2000. In: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 426–431. ACM, New York (2001)
  42. Fayyad, U.M., Irani, K.B.: Multi-interval discretization of continuous-valued attributes for classification learning. In: Proceedings of the 13th International Joint Conference on Artificial Intelligence, pp. 1022–1027. Morgan Kaufmann, San Francisco (1993)
  43. de Finetti, B.: Theory of Probability. Wiley, New York (1974); two volumes, translated from Teoria delle Probabilità (1970); the second volume appeared under the same title in 1975
  44. Fleuret, F.: Fast binary feature selection with conditional mutual information. Journal of Machine Learning Research 5, 1531–1555 (2004)
  45. Frank, E., Hall, M., Pfahringer, B.: Locally weighted naive Bayes. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence, pp. 249–256 (2003)
  46. Friedman, J.: On bias, variance, 0/1-loss, and the curse of dimensionality. Data Mining and Knowledge Discovery 1, 55–77 (1997)
  47. Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian network classifiers. Machine Learning 29(2), 131–163 (1997)
  48. Griffiths, P.G., Webb, R.H., Melis, T.S.: Frequency and initiation of debris flows in Grand Canyon, Arizona. Journal of Geophysical Research 109, 4002–4015 (2004)
  49. Ha, V., Doan, A., Vu, V., Haddawy, P.: Geometric foundations for interval-based probabilities. Annals of Mathematics and Artificial Intelligence 24(1–4), 1–21 (1998)
  50. Hand, D., Yu, K.: Idiot's Bayes - not so stupid after all? International Statistical Review 69(3), 385–398 (2001)
  51. Hoare, Z.: Landscapes of naive Bayes classifiers. Pattern Analysis & Applications 11(1), 59–72 (2008)
  52. Hoeting, J., Madigan, D., Raftery, A., Volinsky, C.: Bayesian model averaging: a tutorial. Statistical Science 14(4), 382–401 (1999)
  53. Ide, J.S., Cozman, F.G.: IPE and L2U: approximate algorithms for credal networks. In: Proceedings of the Second Starting AI Researcher Symposium, pp. 118–127. IOS Press, Amsterdam (2004)
  54. Jaeger, M.: Ignorability for categorical data. Annals of Statistics, 1964–1981 (2005)
  55. Kohavi, R.: Scaling up the accuracy of naive-Bayes classifiers: a decision-tree hybrid. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, pp. 202–207. AAAI Press, Menlo Park (1996)
  56. Kohavi, R., Becker, B., Sommerfield, D.: Improving simple Bayes. In: van Someren, M., Widmer, G. (eds.) ECML 1997. LNCS, vol. 1224, pp. 78–87. Springer, Heidelberg (1997)
  57. Levi, I.: The Enterprise of Knowledge. MIT Press, London (1980)
  58. Little, R.J.A., Rubin, D.B.: Statistical Analysis with Missing Data. Wiley, New York (1987)
  59. Madden, M.: On the classification performance of TAN and general Bayesian networks. Knowledge-Based Systems 22(7), 489–495 (2009)
  60. Manski, C.F.: Partial Identification of Probability Distributions. Springer, New York (2003)
  61. Murphy, K., Weiss, Y., Jordan, M.: Loopy belief propagation for approximate inference: an empirical study. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence, pp. 467–475. Morgan Kaufmann, San Francisco (1999)
  62. Ojala, T., Pietikäinen, M., Mäenpää, T.: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 701–706 (2002)
  63. Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo (1988)
  64. Piatti, A., Antonucci, A., Zaffalon, M.: Building knowledge-based systems by credal networks: a tutorial. In: Baswell, A.R. (ed.) Advances in Mathematics Research, vol. 11. Nova Science Publishers, New York (2010)
  65. Ramoni, M., Sebastiani, P.: Robust learning with missing data. Machine Learning 45(2), 147–170 (2001)
  66. Ferreira da Rocha, J.C., Cozman, F.G.: Inference with separately specified sets of probabilities in credal networks. In: Darwiche, A., Friedman, N. (eds.) Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence (UAI 2002), pp. 430–437. Morgan Kaufmann, San Francisco (2002)
  67. Ferreira da Rocha, J.C., Cozman, F.G.: Inference in credal networks with branch-and-bound algorithms. In: Bernard, J.M., Seidenfeld, T., Zaffalon, M. (eds.) ISIPTA Proceedings in Informatics, vol. 18, pp. 480–493. Carleton Scientific (2003)
  68. da Rocha, J.C., Cozman, F.G., de Campos, C.P.: Inference in polytrees with sets of probabilities. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence, Acapulco, pp. 217–224 (2003)
  69. Rubin, D.B.: Inference and missing data. Biometrika 63(3), 581–592 (1976)
  70. Takahashi, T.: Debris Flow. IAHR Monograph. A.A. Balkema, Rotterdam (1991)
  71. Tessem, B.: Interval probability propagation. International Journal of Approximate Reasoning 7(3), 95–120 (1992)
  72. Troffaes, M.: Decision making with imprecise probabilities: a short review. In: Cozman, F.G. (ed.) SIPTA Newsletter, pp. 4–7. Society for Imprecise Probability Theory and Applications, Manno, Switzerland (December 2004)
  73. Tsoumakas, G., Vlahavas, I.: Random k-labelsets: an ensemble method for multilabel classification. In: Proceedings of the 18th European Conference on Machine Learning, pp. 406–417. Springer, Heidelberg (2007)
  74. Van Der Putten, P., Van Someren, M.: A bias-variance analysis of a real world learning problem: the CoIL Challenge 2000. Machine Learning 57(1), 177–195 (2004)
  75. Walley, P.: Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, New York (1991)
  76. Walley, P.: Inferences from multinomial data: learning about a bag of marbles. Journal of the Royal Statistical Society, Series B 58(1), 3–57 (1996)
  77. Walley, P.: Statistical Reasoning with Imprecise Probabilities. Monographs on Statistics and Applied Probability, vol. 42. Chapman and Hall, London (1991)
  78. Walley, P.: Inferences from multinomial data: learning about a bag of marbles. Journal of the Royal Statistical Society, Series B 58(1), 3–34 (1996)
  79. Witten, I., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, San Francisco (2005)
  80. Wu, X., Kumar, V., Ross Quinlan, J., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G., Ng, A., Liu, B., Yu, P., et al.: Top 10 algorithms in data mining. Knowledge and Information Systems 14(1), 1–37 (2008)
  81. Zaffalon, M.: Statistical inference of the naive credal classifier. In: de Cooman, G., Fine, T.L., Seidenfeld, T. (eds.) ISIPTA 2001: Proceedings of the Second International Symposium on Imprecise Probabilities and Their Applications, pp. 384–393. Shaker, The Netherlands (2001)
  82. Zaffalon, M.: Conservative rules for predictive inference with incomplete data. In: Cozman, F.G., Nau, R., Seidenfeld, T. (eds.) Proceedings of the Fourth International Symposium on Imprecise Probabilities and Their Applications (ISIPTA 2005), pp. 406–415. SIPTA (2005)
  83. Zaffalon, M.: Credible classification for environmental problems. Environmental Modelling & Software 20(8), 1003–1012 (2005)
  84. Zaffalon, M., Fagiuoli, E.: Tree-based credal networks for classification. Reliable Computing 9(6), 487–509 (2003)
  85. Zaffalon, M., Hutter, M.: Robust inference of trees. Annals of Mathematics and Artificial Intelligence 45(1), 215–239 (2005)
  86. Zaffalon, M., Miranda, E.: Conservative inference rule for uncertain reasoning under incompleteness. Journal of Artificial Intelligence Research 34, 757–821 (2009)
  87. Zaffalon, M., Wesnes, K., Petrini, O.: Reliable diagnoses of dementia by the naive credal classifier inferred from incomplete cognitive data. Artificial Intelligence in Medicine 29(1–2), 61–79 (2003)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • G. Corani (1)
  • A. Antonucci (1)
  • M. Zaffalon (1)

  1. IDSIA, Manno, Switzerland
