Introduction
The subject of this chapter, and the one that follows it, is Bayesian decision theory and its use in multi-sensor data fusion. To make our discussion concrete we shall concentrate on the pattern recognition problem [19], in which an unknown pattern, or object, O is to be assigned to one of K possible classes {c_1, c_2, ..., c_K}. In this chapter we shall limit ourselves to using a single logical sensor, or classifier, S. We shall then remove this restriction in Chap. 14, where we consider multiple classifier systems.
In many applications Bayesian decision theory represents the primary fusion algorithm in a multi-sensor data fusion system. In Table 13.1 we list some of these applications together with their Dasarathy classification.
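The single-classifier setting described above reduces to the Bayesian decision rule: assign the unknown object O, represented by an observed feature value, to the class c_k that maximizes the posterior probability p(c_k | y) ∝ p(y | c_k) p(c_k). The following sketch illustrates this rule; the function name, the class likelihoods, and the prior values are all hypothetical and chosen purely for illustration, not taken from the chapter.

```python
import numpy as np

def bayes_classify(likelihoods, priors):
    """Assign an object to one of K classes via the Bayesian decision rule.

    likelihoods[k] = p(y | c_k), the likelihood of the observation y
                     under class c_k (illustrative values).
    priors[k]      = p(c_k), the prior probability of class c_k.
    Returns the index of the maximum-a-posteriori class and the
    normalized posterior probabilities.
    """
    unnormalized = np.asarray(likelihoods) * np.asarray(priors)
    posteriors = unnormalized / unnormalized.sum()  # p(c_k | y)
    return int(np.argmax(posteriors)), posteriors

# Hypothetical example with K = 3 classes.
k, posteriors = bayes_classify([0.05, 0.20, 0.10], [0.5, 0.3, 0.2])
# Object is assigned to class c_{k+1} in the chapter's 1-based notation.
```

Note that normalization by the evidence p(y) does not change the argmax; it is included only so the returned posteriors sum to one.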
References
Andres-Ferrer, J., Juan, A.: Constrained domain maximum likelihood estimation for naive Bayes text classification. Patt. Anal. Appl. 13, 189–196 (2010)
Bennett, P.N.: Using asymmetric distributions to improve text classifier probability estimates. In: SIGIR 2003, Toronto, Canada, July 28–August 1 (2003)
Chow, T.W.S., Huang, D.: Estimating optimal feature subsets using efficient estimation of high-dimensional mutual information. IEEE Trans. Neural Networks 16, 213–224 (2005)
Cunningham, P.: Overfitting and diversity in classification ensembles based on feature selection. Tech. Rept. TCD-CS-2005-17, Dept. Computer Science, Trinity College, Dublin, Ireland (2005)
Domingos, P., Pazzani, M.: On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning 29, 103–130 (1997)
Elkan, C.: Boosting and naive Bayes’ learning. Tech Rept CS97-557. University of California, San Diego (September 1997)
Friedman, J.H.: On bias, variance 0/1-loss and the curse of dimensionality. Data Mining Knowledge Discovery 1, 55–77 (1997)
Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian network classifiers. Mach. Learn. 29, 131–163 (1997)
Fukunaga, K.: Introduction to Statistical Pattern Recognition, 2nd edn. Academic Press (1990)
Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)
Guyon, I., Gunn, S., Nikravesh, M., Zadeh, L.: Feature Extraction, Foundations and Applications. Springer, Heidelberg (2006)
Hastie, T., Tibshirani, R., Friedman, J.: The elements of statistical learning. Springer, Heidelberg (2001)
Hoare, Z.: Landscapes of naive Bayes’ classifiers. Patt. Anal. Appl. 11, 59–72 (2008)
Hsieh, P.-F., Wang, D.-S., Hsu, C.-W.: A linear feature extraction for multi-class classification problems based on class mean and covariance discriminant information. IEEE Trans. Patt. Anal. Mach. Intell. 28, 223–235 (2006)
Holder, L.B., Russell, I., Markov, Z., Pipe, A.G., Carse, B.: Current and future trends in feature selection and extraction for classification problems. Int. J. Patt. Recogn. Art Intell. 19, 133–142 (2005)
Hong, S.J., Hosking, J., Natarajan, R.: Multiplicative adjustment of class probability: educating naive Bayes. Research Report RC 22393 (W0204-041), IBM Research Division, T. J. Watson Research Center, Yorktown Heights, NY 10598 (2002)
Huang, H.-J., Hsu, C.-N.: Bayesian classification for data from the same unknown class. IEEE Trans. Sys. Man Cybern. -Part B: Cybern. 32, 137–145 (2003)
Inza, I., Larranaga, P., Blanco, R., Cerrolaza, A.J.: Filter versus wrapper gene selection approaches in DNA microarray domains. Art Intell. Med. 31, 91–103 (2004)
Jain, A.K., Duin, R.P.W., Mao, J.: Statistical Pattern Recognition: A Review. IEEE Trans. Patt. Anal. Mach. Intell. 22, 4–37 (2000)
Jain, A., Zongker, D.: Feature Selection: Evaluation, application and small sample performance. IEEE Trans. Patt. Anal. Mach. Intell. 19, 153–158 (1997)
Keogh, E.J., Pazzani, M.J.: Learning the structure of augmented Bayesian classifiers. Int. Art Intell. Tools 11, 587–601 (2002)
Kohavi, R., John, G.: Wrappers for feature subset selection. Art. Intell. 97, 273–324 (1997)
Kittler, J., Hatef, M., Duin, R.P.W., Matas, J.: On combining classifiers. IEEE Trans. Patt. Anal. Mach. Intell. 20, 226–239 (1998)
Kuncheva, L.I.: Combining Pattern Classifiers. John Wiley and Sons (2004)
Kwak, N., Choi, C.-H.: Input feature selection using mutual information based on Parzen window. IEEE Patt. Anal. Mach. Intell. 24, 1667–1671 (2002)
Lee, C., Choi, E.: Bayes error evaluation of the Gaussian ML classifier. IEEE Trans. Geosci. Rem. Sens. 38, 1471–1475 (2000)
Liang, Y., Gong, W., Pan, Y., Li, W.: Generalized relevance weighted LDA. Patt. Recogn. 38, 2217–2219 (2005)
Liu, B., Yang, Y., Webb, G.I., Boughton, J.: A Comparative Study of Bandwidth Choice in Kernel Density Estimation for Naive Bayesian Classification. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, T.-B. (eds.) PAKDD 2009. LNCS, vol. 5476, pp. 302–313. Springer, Heidelberg (2009)
Loog, M., Duin, R.P.W.: Linear dimensionality reduction via a heteroscedastic extension of LDA: The Chernoff criterion. IEEE Trans. Patt. Anal. Mach. Intell. 26, 732–739 (2004)
Lorena, A.C., de Carvalho, A.C.P.L., Gama, J.M.P.: A review on the combination of binary classifiers in multiclass problems. Art Intell. Rev. 30, 19–37 (2008)
Loughrey, J., Cunningham, P.: Overfitting in wrapper-based feature subset selection: the harder you try the worse it gets. In: 24th SGAI Int. Conf. Innovative Tech. App. Art Intell., pp. 33–43 (2004); See also Technical Report TCD-CS-2005-17, Department of Computer Science, Trinity College, Dublin, Ireland
Loughrey, J., Cunningham, P.: Using early-stopping to avoid overfitting in wrapper-based feature selection employing stochastic search. Technical Report TCD-CS-2005-37, Department of Computer Science, Trinity College, Dublin, Ireland (2005)
Martin, J.K., Hirschberg, D.S.: Small sample statistics for classification error rates I: error rate measurements. Technical Report No. 96-21, July 2, Department of Information and Computer Science, University of California, Irvine, CA 92697-3425, USA (1996a)
Martin, J.K., Hirschberg, D.S.: Small sample statistics for classification error rates II: confidence intervals and significance tests. Technical Report No. 96-22, July 7, Department of Information and Computer Science, University of California, Irvine, CA 92697-3425, USA (1996b)
Peng, H., Long, F., Ding, C.: Feature selection based on mutual information: criteria of max-dependency, max-relevance and min-redundancy. IEEE Trans. Patt. Anal. Mach. Intell. 27, 1226–1238 (2005)
Qin, A.K., Suganthan, P.N., Loog, M.: Uncorrelated heteroscedastic LDA based on the weighted pairwise Chernoff criterion. Patt. Recogn. 38, 613–616 (2005)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for machine learning. MIT Press, MA (2006)
Raudys, S.J., Jain, A.K.: Small sample size effects in statistical pattern recognition: recommendations for practitioners. IEEE Trans. Patt. Anal. Mach. Intell. 13, 252–264 (1991)
Ridgeway, G., Madigan, D., Richardson, T., O’Kane, J.W.: Interpretable boosted naive Bayes classification. In: Proc. 4th Int. Conf. Knowledge Discovery Data Mining, New York, pp. 101–104 (1998)
Rish, I., Hellerstein, J., Jayram, T.S.: An analysis of data characteristics that affect naive Bayes performance. IBM Technical Report RC2, IBM Research Division, Thomas J. Watson Research Center, NY 10598 (2001)
Silverman, B.: Density estimation for statistical data analysis. Chapman-Hall (1986)
Sulzmann, J.-N., Fürnkranz, J., Hüllermeier, E.: On pairwise naive Bayes classifiers. In: Euro. Conf. Mach. Learn., pp. 371–381 (2007)
Tang, E.K., Suganthan, P.N., Yao, X., Qin, A.K.: Linear dimensionality reduction using relevance weighted LDA. Pattern Recognition 38, 485–493 (2005)
Thomas, C.S.: Classifying acute abdominal pain by assuming independence: a study using two models constructed directly from data. Technical Report CSM-153, Department of Computing Science and Mathematics, University of Stirling, Stirling, Scotland (1999)
Thomaz, C.E., Gillies, D.F.: “Small sample size”: A methodological problem in Bayes plug-in classifier for image recognition. Technical Report 6/2001, Department of Computing, Imperial College of Science, Technology and Medicine, London (2001)
Thomaz, C.E., Gillies, D.F., Feitosa, R.Q.: A new covariance estimate for Bayesian classifiers in biometric recognition. IEEE Trans. Circuits Sys. Video Tech. 14, 214–223 (2004)
Trunk, G.V.: A problem of dimensionality: a simple example. IEEE Trans. Patt. Anal. Mach. Intell. 1, 306–307 (1979)
Viaene, S., Derrig, R., Dedene, G.: A case study of applying boosting naive Bayes to claim fraud diagnosis. IEEE Trans. Knowl. Data Eng. 16, 612–619 (2004)
Vilalta, R., Rish, I.: A decomposition of Classes Via Clustering to Explain and Improve Naive Bayes. In: Lavrač, N., Gamberger, D., Todorovski, L., Blockeel, H. (eds.) ECML 2003. LNCS (LNAI), vol. 2837, pp. 444–455. Springer, Heidelberg (2003)
Webb, A.: Statistical Pattern Recognition. Arnold, England (1999)
Webb, G.I., Boughton, J., Wang, Z.: Not so naive Bayes: aggregating one-dependence estimators. Mach. Learn. 58, 5–24 (2005)
Xie, Z., Hsu, W., Liu, Z., Li Lee, M.: SNNB: A Selective Neighborhood Based Naïve Bayes for Lazy Learning. In: Chen, M.-S., Yu, P.S., Liu, B. (eds.) PAKDD 2002. LNCS (LNAI), vol. 2336, pp. 104–114. Springer, Heidelberg (2002)
Xie, Z., Zhang, Q.: A study of selective neighbourhood-based naive Bayes for efficient lazy learning. In: Proc. 16th IEEE Int. Conf. Tools Art Intell. ICTAI, Boca Raton, Florida (2004)
Zhang, H.: Exploring conditions for the optimality of naive Bayes. Int. J. Patt. Recog. Art Intell. (2005)
Zheng, Z.: Naive Bayesian Classifier Committee. In: Nédellec, C., Rouveirol, C. (eds.) ECML 1998. LNCS, vol. 1398, pp. 196–207. Springer, Heidelberg (1998)
Zheng, Z., Webb, G.I.: Lazy learning of Bayesian rules. Mach. Learn. 41, 53–87 (2000)
Zheng, Z., Webb, G.I., Ting, K.M.: Lazy Bayesian rules: a lazy semi-naive Bayesian learning technique competitive to boosting decision trees. In: Proc. ICML 1999. Morgan Kaufmann (1999)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this chapter
Mitchell, H.B. (2012). Bayesian Decision Theory. In: Data Fusion: Concepts and Ideas. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27222-6_13