Hybrid Classification of High-Dimensional Biomedical Tumour Datasets

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 386)

Abstract

This paper concerns a hybrid approach to the classification of high-dimensional tumour data. It compares hybrid classification methods: bagging with Naive Bayes (NaiveBayes), IBk, J48 and SMO as base classifiers; random forest, a variant of bagging with a decision tree as the base classifier; boosting with NaiveBayes, SMO, IBk and J48 as base classifiers; and voting over all single classifiers with majority as the combination rule. These are set against five single classification strategies: k-nearest neighbours (IBk), J48, NaiveBayes, random tree, and the sequential minimal optimization (SMO) algorithm for training support vector machines. The major conclusion of the study is that hybrid classifiers demonstrated their potential to classify both binary and multiclass high-dimensional sets of tumour specimens accurately and efficiently.
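The majority combination rule named above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sample labels and the tie-breaking rule (the first label to reach the top count wins) are assumptions made for this example only.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most classifiers for a single sample.

    Ties go to the label that appears first in `predictions`.
    This tie rule is an assumption; the abstract does not specify one.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def vote_all(per_classifier_predictions):
    """Combine several classifiers' label lists, one vote per sample."""
    # zip(*...) regroups the labels so each tuple holds one sample's votes.
    return [majority_vote(list(votes)) for votes in zip(*per_classifier_predictions)]


# Hypothetical outputs of the five single classifiers on three specimens.
preds = {
    "IBk":        ["tumour", "normal", "tumour"],
    "J48":        ["tumour", "tumour", "normal"],
    "NaiveBayes": ["normal", "normal", "tumour"],
    "RandomTree": ["tumour", "normal", "normal"],
    "SMO":        ["tumour", "normal", "tumour"],
}

combined = vote_all(list(preds.values()))
print(combined)  # ['tumour', 'normal', 'tumour']
```

With an odd number of voters and two classes, as in this example, ties cannot occur; the tie rule only matters for multiclass data or an even number of classifiers.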

Keywords

Hybrid classification; Ensemble classifiers; High-dimensional datasets; Tumour classification

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. University of Computer Sciences and Skills, Lodz, Poland
  2. Institute of Information Technology, Lodz University of Technology, Lodz, Poland
