A Coupling Support Vector Machines with the Feature Learning of Deep Convolutional Neural Networks for Classifying Microarray Gene Expression Data

Huynh, Phuoc-Hai; Nguyen, Van-Hoa; Do, Thanh-Nghi

doi:10.1007/978-3-319-76081-0_20

Part of the book series: Studies in Computational Intelligence (SCI, volume 769)

Abstract

Support vector machines (SVMs) and deep convolutional neural networks (DCNNs) are state-of-the-art classification techniques in many real-world applications. We propose a hybrid model, called DCNN-SVM, that combines the two to classify very-high-dimensional gene expression data. DCNN-SVM first trains a DCNN to extract features automatically from microarray gene expression data, and then trains a non-linear SVM on those features to perform the classification. Numerical tests on 15 microarray datasets from ArrayExpress and the Kent Ridge Bio-medical Data Set Repository show that the proposed DCNN-SVM is more accurate than a classical DCNN, an SVM, and random forests.
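The two-stage idea described in the abstract (convolutional feature extraction followed by a non-linear SVM) can be sketched roughly as below. This is not the authors' implementation. To keep the example small, the trained DCNN is replaced by a fixed bank of random, untrained 1-D convolution filters with ReLU and max pooling, the data is synthetic, and every name and hyperparameter (`conv_features`, the filter count, `C`, the pooling width) is a hypothetical choice made for illustration.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for microarray data: 60 samples x 200 "genes", 2 classes.
n_samples, n_genes = 60, 200
y = np.repeat([0, 1], n_samples // 2)
X = rng.normal(size=(n_samples, n_genes))
X[y == 1, 50:70] += 1.5  # class 1 over-expresses a block of 20 genes


def conv_features(X, filters, pool=4):
    """Apply 1-D convolution, ReLU and max pooling to every row of X.

    ASSUMPTION: the filters are random and untrained. In the chapter the
    convolutional layers are learned; this function only mimics their shape.
    """
    n = X.shape[0]
    feats = []
    for w in filters:
        # 'valid' convolution of each expression profile with one filter
        conv = np.stack([np.convolve(x, w, mode="valid") for x in X])
        conv = np.maximum(conv, 0.0)  # ReLU
        usable = (conv.shape[1] // pool) * pool  # drop any ragged tail
        pooled = conv[:, :usable].reshape(n, -1, pool).max(axis=2)
        feats.append(pooled)
    return np.hstack(feats)


# Stage 1: map raw expression profiles to convolutional features.
filters = rng.normal(size=(8, 5))  # 8 hypothetical filters of width 5
F = conv_features(X, filters)

# Stage 2: train a non-linear (RBF-kernel) SVM on the extracted features.
idx = rng.permutation(n_samples)
train, test = idx[:40], idx[40:]
clf = SVC(kernel="rbf", C=10.0, gamma="scale")
clf.fit(F[train], y[train])

accuracy = clf.score(F[test], y[test])
print(f"feature matrix: {F.shape}, held-out accuracy: {accuracy:.2f}")
```

In a fuller version, stage 1 would be a convolutional network trained on the labelled expression data, with the activations of its last hidden layer passed to the SVM in place of `F`.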

Author information

Authors and Affiliations

An Giang University, An Giang, Vietnam
Phuoc-Hai Huynh & Van-Hoa Nguyen
Can Tho University, Can Tho, Vietnam
Thanh-Nghi Do

Corresponding authors

Correspondence to Phuoc-Hai Huynh, Van-Hoa Nguyen or Thanh-Nghi Do.

Editor information

Editors and Affiliations

Department of Information Systems, Wrocław University of Science and Technology, Wrocław, Poland
Andrzej Sieminski
Department of Information Systems, Wrocław University of Science and Technology, Wrocław, Poland
Adrianna Kozierkiewicz
Department of Information Systems and Computing, Complutense University of Madrid, Madrid, Spain
Manuel Nunez
Faculty of Information Technology, Vietnam National University, Hanoi, Vietnam
Quang Thuy Ha

About this chapter

Cite this chapter

Huynh, PH., Nguyen, VH., Do, TN. (2018). A Coupling Support Vector Machines with the Feature Learning of Deep Convolutional Neural Networks for Classifying Microarray Gene Expression Data. In: Sieminski, A., Kozierkiewicz, A., Nunez, M., Ha, Q. (eds) Modern Approaches for Intelligent Information and Database Systems. Studies in Computational Intelligence, vol 769. Springer, Cham. https://doi.org/10.1007/978-3-319-76081-0_20

DOI: https://doi.org/10.1007/978-3-319-76081-0_20
Published: 24 February 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76080-3
Online ISBN: 978-3-319-76081-0
eBook Packages: Engineering, Engineering (R0)
