
Robust Bayesian Classification with Incomplete Data

Abstract

In this paper, we address Bayesian classification with incomplete data. The common approach in the literature is to simply discard the samples with missing values or to impute the missing values before classification. However, these methods are not effective when a large portion of the data have missing values and the acquisition of samples is expensive. Motivated by these limitations, we propose an expectation-maximization algorithm for learning a multivariate Gaussian mixture model and a multiple kernel density estimator based on propensity scores, so that classification tasks with incomplete data can be solved without resorting to listwise deletion (LD) or mean imputation (MI). We illustrate the effectiveness of the proposed algorithms on artificial and benchmark UCI data sets by comparing them with the LD and MI methods. We also apply these algorithms to two practical classification tasks: lithology identification of hydrothermal minerals and license plate character recognition. The experimental results demonstrate good performance with high classification accuracies.
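
To make the first ingredient concrete, the following is a minimal sketch (not the paper's implementation) of EM estimation of a single multivariate Gaussian from data whose entries are missing at random; the paper extends this idea to class-conditional Gaussian mixtures, which adds component responsibilities to the E-step. The function name em_gaussian_missing, the iteration count, and the regularization constant are illustrative assumptions, not taken from the article.

import numpy as np

def em_gaussian_missing(X, n_iter=50, reg=1e-6):
    # EM estimate of the mean and covariance of one multivariate Gaussian
    # from data X (n_samples x n_features) whose missing entries are NaN.
    # A standard textbook scheme; the paper fits a full Gaussian mixture
    # per class, adding component responsibilities to the E-step.
    n, d = X.shape
    miss = np.isnan(X)

    # Initialise from mean-imputed data.
    mu = np.nanmean(X, axis=0)
    sigma = np.cov(np.where(miss, mu, X), rowvar=False) + reg * np.eye(d)

    for _ in range(n_iter):
        X_hat = np.where(miss, mu, X)   # filled with conditional means below
        C = np.zeros((d, d))            # accumulated conditional covariances
        for i in range(n):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            Soo = sigma[np.ix_(o, o)]
            Smo = sigma[np.ix_(m, o)]
            Smm = sigma[np.ix_(m, m)]
            coef = np.linalg.solve(Soo, Smo.T).T            # Smo @ inv(Soo)
            X_hat[i, m] = mu[m] + coef @ (X[i, o] - mu[o])  # E[x_m | x_o]
            C[np.ix_(m, m)] += Smm - coef @ Smo.T           # Cov[x_m | x_o]
        # M-step: refit mean and covariance to the completed statistics.
        mu = X_hat.mean(axis=0)
        diff = X_hat - mu
        sigma = (diff.T @ diff + C) / n + reg * np.eye(d)
    return mu, sigma

Class-conditional densities estimated this way can be plugged into the Bayes rule p(y|x) ∝ p(x|y)p(y) to classify test samples; the mixture-model and kernel-density variants studied in the paper follow the same overall pattern.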

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant 61273233, the Project of China Ocean Association under Grant DYXM-125-25-02, and Tsinghua University Initiative Scientific Research Program under Grants 2010THZ07002 and 2011THZ07132.

Author information

Corresponding author

Correspondence to Shiji Song.

About this article

Cite this article

Zhang, X., Song, S. & Wu, C. Robust Bayesian Classification with Incomplete Data. Cogn Comput 5, 170–187 (2013). https://doi.org/10.1007/s12559-012-9188-6
