Exploiting Ladder Networks for Gene Expression Classification

  • Conference paper
Bioinformatics and Biomedical Engineering (IWBBIO 2018)

Abstract

The application of deep learning to biology is of increasing relevance but remains difficult, mainly because massive amounts of training data are rarely available. However, some recent applications of deep learning to the classification of labeled cancer datasets have been successful. Along this direction, we apply Ladder networks, a recent and interesting network model, to the binary cancer classification problem. Our results improve over the state of the art in deep learning and over the conventional state of the art in machine learning. Achieving these results required careful adaptation of the available datasets and tuning of the network.

Acknowledgment

This work was supported by the ERC Advanced Grant GeCo (Data-Driven Genomic Computing) (Grant No. 693174) awarded to Prof. Stefano Ceri.

We thank Prof. Stefano Ceri, who provided insight and expertise that greatly assisted the research and comments that greatly improved the manuscript.

We would also like to thank the members of the GeCo project for their helpful insights.

Author information

Corresponding author

Correspondence to Arif Canakoglu.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Golcuk, G., Tuncel, M.A., Canakoglu, A. (2018). Exploiting Ladder Networks for Gene Expression Classification. In: Rojas, I., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2018. Lecture Notes in Computer Science, vol 10813. Springer, Cham. https://doi.org/10.1007/978-3-319-78723-7_23

  • DOI: https://doi.org/10.1007/978-3-319-78723-7_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-78722-0

  • Online ISBN: 978-3-319-78723-7

  • eBook Packages: Computer Science, Computer Science (R0)
