Bayesian Decision Theory

Introduction

The subject of this chapter, and the one that follows it, is Bayesian decision theory and its use in multi-sensor data fusion. To make our discussion concrete we shall concentrate on the pattern recognition problem [19], in which an unknown pattern, or object, O is to be assigned to one of K possible classes {c_1, c_2, ..., c_K}. In this chapter we shall limit ourselves to using a single logical sensor, or classifier, S. We shall then remove this restriction in Chap. 14, where we consider multiple classifier systems.
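To make the decision rule concrete, consider the following Python sketch. It is our illustration rather than code from the chapter (the function map_classify and all variable names are hypothetical): the a posteriori probability P(c_k | y) of each class given the sensor observation y is computed from the class-conditional likelihood p(y | c_k) and the a priori probability P(c_k) via Bayes' theorem, and O is assigned to the class with the largest posterior.

    import numpy as np
    from scipy.stats import multivariate_normal

    def map_classify(y, priors, likelihoods):
        # Bayes' theorem: P(c_k | y) is proportional to p(y | c_k) * P(c_k).
        posteriors = np.array([lik.pdf(y) * prior
                               for lik, prior in zip(likelihoods, priors)])
        posteriors /= posteriors.sum()  # normalize by the evidence p(y)
        # MAP rule: choose the class with the largest posterior probability.
        return int(np.argmax(posteriors)), posteriors

    # Illustration: K = 2 Gaussian classes observed through one logical sensor S.
    likelihoods = [multivariate_normal(mean=[0.0, 0.0]),   # p(y | c_1)
                   multivariate_normal(mean=[2.0, 2.0])]   # p(y | c_2)
    priors = [0.5, 0.5]                                    # P(c_1), P(c_2)
    k, post = map_classify([1.8, 2.1], priors, likelihoods)
    print("assigned class c_%d, posteriors = %s" % (k + 1, post))

Under a zero-one loss function this maximum a posteriori (MAP) assignment minimizes the probability of misclassification.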

In many applications Bayesian decision theory represents the primary fusion algorithm in a multi-sensor data fusion system. In Table 13.1 we list some of these applications together with their Dasarathy classification.


References

  1. Andres-Ferrer, J., Juan, A.: Constrained domain maximum likelihood estimation for naive Bayes text classification. Patt. Anal. Appl. 13, 189–196 (2010)

  2. Bennett, P.N.: Using asymmetric distributions to improve text classifier probability estimates. In: SIGIR 2003, Toronto, Canada, July 28–August 1 (2003)

  3. Chow, T.W.S., Huang, D.: Estimating optimal feature subsets using efficient estimation of high-dimensional mutual information. IEEE Trans. Neural Networks 16, 213–224 (2005)

  4. Cunningham, P.: Overfitting and diversity in classification ensembles based on feature selection. Tech. Rept. TCD-CS-2005-17, Dept. Computer Science, Trinity College, Dublin, Ireland (2005)

  5. Domingos, P., Pazzani, M.: On the optimality of the simple Bayesian classifier under zero-one loss. Mach. Learn. 29, 103–130 (1997)

  6. Elkan, C.: Boosting and naive Bayes learning. Tech. Rept. CS97-557, University of California, San Diego (September 1997)

  7. Friedman, J.H.: On bias, variance, 0/1-loss and the curse of dimensionality. Data Min. Knowl. Discov. 1, 55–77 (1997)

  8. Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian network classifiers. Mach. Learn. 29, 131–163 (1997)

  9. Fukunaga, K.: Introduction to Statistical Pattern Recognition, 2nd edn. Academic Press (1990)

  10. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)

  11. Guyon, I., Gunn, S., Nikravesh, M., Zadeh, L.: Feature Extraction: Foundations and Applications. Springer, Heidelberg (2006)

  12. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. Springer, Heidelberg (2001)

  13. Hoare, Z.: Landscapes of naive Bayes classifiers. Patt. Anal. Appl. 11, 59–72 (2008)

  14. Hsieh, P.-F., Wang, D.-S., Hsu, C.-W.: A linear feature extraction for multi-class classification problems based on class mean and covariance discriminant information. IEEE Trans. Patt. Anal. Mach. Intell. 28, 223–235 (2006)

  15. Holder, L.B., Russell, I., Markov, Z., Pipe, A.G., Carse, B.: Current and future trends in feature selection and extraction for classification problems. Int. J. Patt. Recogn. Art. Intell. 19, 133–142 (2005)

  16. Hong, S.J., Hosking, J., Natarajan, R.: Multiplicative adjustment of class probability: educating naive Bayes. Research Report RC 22393 (W0204-041), IBM Research Division, T. J. Watson Research Center, Yorktown Heights, NY 10598 (2002)

  17. Huang, H.-J., Hsu, C.-N.: Bayesian classification for data from the same unknown class. IEEE Trans. Sys. Man Cybern. Part B: Cybern. 32, 137–145 (2003)

  18. Inza, I., Larranaga, P., Blanco, R., Cerrolaza, A.J.: Filter versus wrapper gene selection approaches in DNA microarray domains. Art. Intell. Med. 31, 91–103 (2004)

  19. Jain, A.K., Duin, R.P.W., Mao, J.: Statistical pattern recognition: a review. IEEE Trans. Patt. Anal. Mach. Intell. 22, 4–37 (2000)

  20. Jain, A., Zongker, D.: Feature selection: evaluation, application and small sample performance. IEEE Trans. Patt. Anal. Mach. Intell. 19, 153–158 (1997)

  21. Keogh, E.J., Pazzani, M.J.: Learning the structure of augmented Bayesian classifiers. Int. J. Art. Intell. Tools 11, 587–601 (2002)

  22. Kohavi, R., John, G.: Wrappers for feature subset selection. Art. Intell. 97, 273–324 (1997)

  23. Kittler, J., Hatef, M., Duin, R.P.W., Matas, J.: On combining classifiers. IEEE Trans. Patt. Anal. Mach. Intell. 20, 226–239 (1998)

  24. Kuncheva, L.I.: Combining Pattern Classifiers. John Wiley and Sons (2004)

  25. Kwak, N., Choi, C.-H.: Input feature selection using mutual information based on Parzen window. IEEE Trans. Patt. Anal. Mach. Intell. 24, 1667–1671 (2002)

  26. Lee, C., Choi, E.: Bayes error evaluation of the Gaussian ML classifier. IEEE Trans. Geosci. Remote Sens. 38, 1471–1475 (2000)

  27. Liang, Y., Gong, W., Pan, Y., Li, W.: Generalized relevance weighted LDA. Patt. Recogn. 38, 2217–2219 (2005)

  28. Liu, B., Yang, Y., Webb, G.I., Boughton, J.: A comparative study of bandwidth choice in kernel density estimation for naive Bayesian classification. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, T.-B. (eds.) PAKDD 2009. LNCS, vol. 5476, pp. 302–313. Springer, Heidelberg (2009)

  29. Loog, M., Duin, R.P.W.: Linear dimensionality reduction via a heteroscedastic extension of LDA: the Chernoff criterion. IEEE Trans. Patt. Anal. Mach. Intell. 26, 732–739 (2004)

  30. Lorena, A.C., de Carvalho, A.C.P.L., Gama, J.M.P.: A review on the combination of binary classifiers in multiclass problems. Art. Intell. Rev. 30, 19–37 (2008)

  31. Loughrey, J., Cunningham, P.: Overfitting in wrapper-based feature subset selection: the harder you try the worse it gets. In: 24th SGAI Int. Conf. Innovative Tech. App. Art. Intell., pp. 33–43 (2004); see also Tech. Rept. TCD-CS-2005-17, Dept. Computer Science, Trinity College, Dublin, Ireland

  32. Loughrey, J., Cunningham, P.: Using early-stopping to avoid overfitting in wrapper-based feature selection employing stochastic search. Tech. Rept. TCD-CS-2005-37, Dept. Computer Science, Trinity College, Dublin, Ireland (2005)

  33. Martin, J.K., Hirschberg, D.S.: Small sample statistics for classification error rates I: error rate measurements. Tech. Rept. 96-21, July 2, Dept. Information and Computer Science, University of California, Irvine, CA 92697-3425, USA (1996a)

  34. Martin, J.K., Hirschberg, D.S.: Small sample statistics for classification error rates II: confidence intervals and significance tests. Tech. Rept. 96-22, July 7, Dept. Information and Computer Science, University of California, Irvine, CA 92697-3425, USA (1996b)

  35. Peng, H., Long, F., Ding, C.: Feature selection based on mutual information: criteria of max-dependency, max-relevance and min-redundancy. IEEE Trans. Patt. Anal. Mach. Intell. 27, 1226–1238 (2005)

  36. Qin, A.K., Suganthan, P.N., Loog, M.: Uncorrelated heteroscedastic LDA based on the weighted pairwise Chernoff criterion. Patt. Recogn. 38, 613–616 (2005)

  37. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA (2006)

  38. Raudys, S.J., Jain, A.K.: Small sample size effects in statistical pattern recognition: recommendations for practitioners. IEEE Trans. Patt. Anal. Mach. Intell. 13, 252–264 (1991)

  39. Ridgeway, G., Madigan, D., Richardson, T., O’Kane, J.W.: Interpretable boosted naive Bayes classification. In: Proc. 4th Int. Conf. Knowledge Discovery Data Mining, New York, pp. 101–104 (1998)

  40. Rish, I., Hellerstein, J., Jayram, T.S.: An analysis of data characteristics that affect naive Bayes performance. IBM Tech. Rept. RC2, IBM Research Division, Thomas J. Watson Research Center, NY 10598 (2001)

  41. Silverman, B.: Density Estimation for Statistics and Data Analysis. Chapman and Hall (1986)

  42. Sulzmann, J.-N., Fürnkranz, J., Hüllermeier, E.: On pairwise naive Bayes classifiers. In: Euro. Conf. Mach. Learn., pp. 371–381 (2007)

  43. Tang, E.K., Suganthan, P.N., Yao, X., Qin, A.K.: Linear dimensionality reduction using relevance weighted LDA. Patt. Recogn. 38, 485–493 (2005)

  44. Thomas, C.S.: Classifying acute abdominal pain by assuming independence: a study using two models constructed directly from data. Tech. Rept. CSM-153, Dept. Computing Science and Mathematics, University of Stirling, Stirling, Scotland (1999)

  45. Thomaz, C.E., Gillies, D.F.: “Small sample size”: a methodological problem in Bayes plug-in classifier for image recognition. Tech. Rept. 6/2001, Dept. Computing, Imperial College of Science, Technology and Medicine, London (2001)

  46. Thomaz, C.E., Gillies, D.F., Feitosa, R.Q.: A new covariance estimate for Bayesian classifiers in biometric recognition. IEEE Trans. Circuits Sys. Video Tech. 14, 214–223 (2004)

  47. Trunk, G.V.: A problem of dimensionality: a simple example. IEEE Trans. Patt. Anal. Mach. Intell. 1, 306–307 (1979)

  48. Viaene, S., Derrig, R., Dedene, G.: A case study of applying boosting naive Bayes to claim fraud diagnosis. IEEE Trans. Knowl. Data Eng. 16, 612–619 (2004)

  49. Vilalta, R., Rish, I.: A decomposition of classes via clustering to explain and improve naive Bayes. In: Lavrač, N., Gamberger, D., Todorovski, L., Blockeel, H. (eds.) ECML 2003. LNCS (LNAI), vol. 2837, pp. 444–455. Springer, Heidelberg (2003)

  50. Webb, A.: Statistical Pattern Recognition. Arnold, England (1999)

  51. Webb, G.I., Boughton, J., Wang, Z.: Not so naive Bayes: aggregating one-dependence estimators. Mach. Learn. 58, 5–24 (2005)

  52. Xie, Z., Hsu, W., Liu, Z., Li Lee, M.: SNNB: a selective neighborhood based naive Bayes for lazy learning. In: Chen, M.-S., Yu, P.S., Liu, B. (eds.) PAKDD 2002. LNCS (LNAI), vol. 2336, pp. 104–114. Springer, Heidelberg (2002)

  53. Xie, Z., Zhang, Q.: A study of selective neighbourhood-based naive Bayes for efficient lazy learning. In: Proc. 16th IEEE Int. Conf. Tools with Art. Intell. (ICTAI), Boca Raton, Florida (2004)

  54. Zhang, H.: Exploring conditions for the optimality of naive Bayes. Int. J. Patt. Recogn. Art. Intell. (2005)

  55. Zheng, Z.: Naive Bayesian classifier committee. In: Nédellec, C., Rouveirol, C. (eds.) ECML 1998. LNCS, vol. 1398, pp. 196–207. Springer, Heidelberg (1998)

  56. Zheng, Z., Webb, G.I.: Lazy learning of Bayesian rules. Mach. Learn. 41, 53–87 (2000)

  57. Zheng, Z., Webb, G.I., Ting, K.M.: Lazy Bayesian rules: a lazy semi-naive Bayesian learning technique competitive to boosting decision trees. In: Proc. ICML 1999. Morgan Kaufmann (1999)

Author information

Correspondence to H. B. Mitchell.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Mitchell, H.B. (2012). Bayesian Decision Theory. In: Data Fusion: Concepts and Ideas. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27222-6_13

  • DOI: https://doi.org/10.1007/978-3-642-27222-6_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-27221-9

  • Online ISBN: 978-3-642-27222-6

  • eBook Packages: Engineering (R0)
