Hybrid Classification of High-Dimensional Biomedical Tumour Datasets
Abstract
This paper concerns a hybrid approach to the classification of high-dimensional tumour data. The research compares hybrid classification methods: bagging with Naive Bayes (NaiveBayes), IBk, J48 and SMO as base classifiers; random forest, a variant of bagging with a decision tree as the base classifier; boosting with NaiveBayes, SMO, IBk and J48 as base classifiers; and voting by all single classifiers with majority as the combination rule. The comparison also covers five single classification strategies: k-nearest neighbours (IBk), J48, NaiveBayes, random tree, and the sequential minimal optimization algorithm for training support vector machines (SMO). The major conclusion drawn from the study is that hybrid classifiers demonstrated the potential to classify both binary and multiclass high-dimensional sets of tumour specimens accurately and efficiently.
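As an illustration of the majority combination rule mentioned in the abstract, the following is a minimal sketch in plain Python. The function names, the toy labels and the tie-breaking rule (the first label to reach the top count wins) are assumptions made for this example; they are not taken from the study or from the software it used.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the largest number of base classifiers.

    Assumption for this sketch: ties go to the tied label that
    appears first in `predictions`.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def vote_ensemble(per_classifier_predictions):
    """Combine per-classifier label lists into one label per specimen.

    `per_classifier_predictions` holds one list of labels per base
    classifier; all lists cover the same specimens in the same order.
    """
    return [majority_vote(list(column))
            for column in zip(*per_classifier_predictions)]


# Hypothetical predictions of five single classifiers for four specimens.
toy_predictions = {
    "IBk":        ["tumour", "normal", "tumour", "normal"],
    "J48":        ["tumour", "tumour", "normal", "normal"],
    "NaiveBayes": ["normal", "tumour", "tumour", "normal"],
    "RandomTree": ["tumour", "normal", "normal", "tumour"],
    "SMO":        ["tumour", "tumour", "tumour", "normal"],
}

combined = vote_ensemble(list(toy_predictions.values()))
print(combined)
```

With an odd number of voters and two classes, as in this toy case, no tie can occur; the tie-breaking assumption only matters for multiclass data or an even number of voters.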
Keywords
Hybrid classification, Ensemble classifiers, High-dimensional datasets, Tumour classification