Abstract
We propose a data classification method with missing data handling based on kernel partial least squares (kernel PLS) and kernel PLS discriminant analysis (kernel PLSDA). The novelty of the method is that the class variables are used to validate the imputation of missing values. Moreover, this paper is the first to apply kernel PLS to the handling and classification of missing data. By experimentally comparing the results of several classification methods with missing data handling on three open biomedical datasets (Arrhythmia, Mammographic Mass, and Pima Indians Diabetes from the UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/datasets.html), we found that the proposed kernel PLS plus kernel PLSDA yielded higher accuracy than the existing methods.
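To make the pipeline concrete, below is a minimal sketch of kernel PLS-DA on a toy two-class problem: classes are one-hot coded as the response matrix, a NIPALS-style dual (kernel) PLS is fitted, and the predicted class is the argmax of the regressed response. This follows the standard kernel PLS formulation in the literature, not necessarily the exact algorithm of the paper, and it omits the paper's missing-value imputation and its class-variable validation step; all function names, the RBF kernel choice, and parameter values are illustrative.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def center_kernel(K, K_train=None):
    """Center a kernel matrix in feature space; if K_train is given,
    K holds test-versus-train kernel evaluations."""
    if K_train is None:
        n = K.shape[0]
        J = np.ones((n, n)) / n
        return K - J @ K - K @ J + J @ K @ J
    n = K_train.shape[0]
    Jn = np.ones((n, n)) / n
    Jt = np.ones((K.shape[0], n)) / n
    return K - Jt @ K_train - K @ Jn + Jt @ K_train @ Jn

def kernel_pls_fit(K, Y, n_components=1, n_iter=200, tol=1e-10):
    """NIPALS-style kernel PLS on a centered kernel K and centered
    response Y; returns dual coefficients B so that predictions are
    K_test_centered @ B."""
    Kd, Yd = K.copy(), Y.copy()
    n = K.shape[0]
    T = np.zeros((n, n_components))
    U = np.zeros((n, n_components))
    for a in range(n_components):
        u = Yd[:, :1].copy()
        for _ in range(n_iter):
            t = Kd @ u
            t /= np.linalg.norm(t)
            q = Yd.T @ t
            u_new = Yd @ q
            u_new /= np.linalg.norm(u_new)
            if np.linalg.norm(u_new - u) < tol:
                u = u_new
                break
            u = u_new
        T[:, a:a + 1], U[:, a:a + 1] = t, u
        P = np.eye(n) - t @ t.T          # deflate with the new score
        Kd = P @ Kd @ P
        Yd = Yd - t @ (t.T @ Yd)
    return U @ np.linalg.solve(T.T @ K @ U, T.T @ Y)

# Toy two-class problem: two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (40, 2)),
               rng.normal(2.0, 0.5, (40, 2))])
y = np.array([0] * 40 + [1] * 40)
Y = np.eye(2)[y]                          # one-hot class coding (PLS-DA)
y_mean = Y.mean(axis=0)

K = rbf_kernel(X, X, gamma=0.5)
B = kernel_pls_fit(center_kernel(K), Y - y_mean, n_components=1)

Kt = center_kernel(rbf_kernel(X, X, gamma=0.5), K_train=K)
pred = np.argmax(Kt @ B + y_mean, axis=1)
acc = (pred == y).mean()
print(f"training accuracy: {acc:.2f}")
```

For a binary problem the centered response matrix has rank one, so a single latent component suffices here; multi-class problems would use more components and the same argmax decision rule.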
Acknowledgments
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. 2007-00559), Gyeonggi-do, and KISTI. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Additional information
This work was done when T. T. Nguyen worked at Institut Pasteur Korea, South Korea.
Cite this article
Nguyen, T.T., Tsoy, Y. A kernel PLS based classification method with missing data handling. Stat Papers 58, 211–225 (2017). https://doi.org/10.1007/s00362-015-0694-y