A Coupling Support Vector Machines with the Feature Learning of Deep Convolutional Neural Networks for Classifying Microarray Gene Expression Data

  • Phuoc-Hai Huynh
  • Van-Hoa Nguyen
  • Thanh-Nghi Do
Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 769)

Abstract

Support vector machines (SVM) and deep convolutional neural networks (DCNNs) are state-of-the-art classification techniques in many real-world applications. We propose a hybrid model combining DCNNs and SVM (called DCNN-SVM) to effectively classify very-high-dimensional gene expression data. DCNN-SVM first trains a DCNN model to automatically extract features from microarray gene expression data, and then learns a non-linear SVM model on those features to classify the data. Numerical test results on 15 microarray datasets from ArrayExpress and the Kent Ridge biomedical data repository show that the proposed DCNN-SVM is more accurate than classical DCNNs, SVM, and random forests.
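The two-stage idea described above (convolutional feature extraction followed by a non-linear SVM) can be illustrated with a minimal sketch. It is not the chapter's implementation: the filters here are random and untrained, standing in for a trained DCNN; the data are synthetic; and the names `conv_features` and `make_synthetic_expression`, the filter sizes, and the SVM hyperparameters are all assumptions chosen for illustration. The sketch uses NumPy and scikit-learn's `SVC`.

```python
import numpy as np
from sklearn.svm import SVC


def conv_features(X, filters, pool=4):
    """1D convolution + ReLU + max pooling over the gene axis.

    X: (n_samples, n_genes) expression matrix.
    filters: (n_filters, width) convolution kernels.
    Returns a (n_samples, n_filters * n_pooled) feature matrix.
    """
    n_samples = X.shape[0]
    n_filters, width = filters.shape
    # Sliding windows along genes: (n_samples, n_positions, width)
    windows = np.lib.stride_tricks.sliding_window_view(X, width, axis=1)
    # Filter responses with ReLU: (n_samples, n_positions, n_filters)
    responses = np.maximum(windows @ filters.T, 0.0)
    # Drop trailing positions that do not fill a whole pooling window
    n_pos = responses.shape[1] - responses.shape[1] % pool
    pooled = responses[:, :n_pos, :].reshape(
        n_samples, n_pos // pool, pool, n_filters
    ).max(axis=2)
    return pooled.reshape(n_samples, -1)


def make_synthetic_expression(n_samples=60, n_genes=200, seed=0):
    """Synthetic stand-in for microarray data (assumption, not real data)."""
    rng = np.random.default_rng(seed)
    y = np.arange(n_samples) % 2
    X = rng.normal(size=(n_samples, n_genes))
    X[y == 1, 50:70] += 1.5  # class 1 over-expresses one block of genes
    return X, y


X, y = make_synthetic_expression()

# Assumption: random, untrained filters replace the trained DCNN layers.
filters = np.random.default_rng(1).normal(size=(8, 5))
features = conv_features(X, filters)

# Stage 2: non-linear (RBF-kernel) SVM on the extracted features.
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(features, y)
preds = clf.predict(features)
```

In the setting the abstract describes, `features` would instead come from the layers of a DCNN trained on the gene expression data, and accuracy would be estimated on held-out samples rather than on the training set.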

Keywords

Microarray gene expression, Convolutional neural networks, Support vector machines


Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. An Giang University, An Giang, Vietnam
  2. Can Tho University, Can Tho, Vietnam
