An Application of Low Bias Bagged SVMs to the Classification of Heterogeneous Malignant Tissues

  • Conference paper
Neural Nets (WIRN 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2859)

Abstract

DNA microarray data combine high dimensionality with small sample size: typically only a few tens of microarray experiments are available for processing, each involving thousands of genes. Given also the large biological variability of gene expression and the noise introduced by the bio-technological machinery, robust, variance-reducing data analysis methods are needed. To this end, we propose an application of a new ensemble method based on the bias–variance decomposition of the error, using Support Vector Machines (SVMs) as base learners. This approach, which we named Low bias bagging (Lobag), tries to reduce both the bias and the variance components of the error by selecting the base learners with the lowest bias and combining them through bootstrap aggregating techniques. We applied Lobag to the classification of normal and heterogeneous malignant tissues using DNA microarray gene expression data. Preliminary results on this challenging two-class classification problem show that Lobag, in association with simple feature selection methods, outperforms both single SVMs and bagged ensembles of SVMs.
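
The two steps named in the abstract (pick the base learner with the lowest estimated bias, then aggregate its bootstrap replicates) can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a noise-free two-class problem under 0/1 loss, it takes the predictions of already-trained models as plain lists instead of training SVMs, and the function names and configuration labels such as "linear_C=1" are hypothetical.

```python
from collections import Counter


def main_prediction(votes):
    """Most frequent label among models trained on different bootstrap samples."""
    return Counter(votes).most_common(1)[0][0]


def bias_variance_01(preds, y_true):
    """Estimate average bias and net variance under 0/1 loss.

    preds[b][i] is the label predicted for point i by the model trained on
    bootstrap replicate b. Assumes two classes and no label noise.
    """
    n_models = len(preds)
    bias_sum = 0.0
    net_var_sum = 0.0
    for i, y in enumerate(y_true):
        votes = [p[i] for p in preds]
        y_main = main_prediction(votes)
        bias = 1 if y_main != y else 0
        variance = sum(v != y_main for v in votes) / n_models
        # Variance raises the error on unbiased points and lowers it on biased ones.
        net_var_sum += -variance if bias else variance
        bias_sum += bias
    n_points = len(y_true)
    return bias_sum / n_points, net_var_sum / n_points


def select_lowest_bias(preds_by_config, y_true):
    """Return the configuration label whose estimated bias is smallest."""
    return min(
        preds_by_config,
        key=lambda cfg: bias_variance_01(preds_by_config[cfg], y_true)[0],
    )


def bagged_predict(preds):
    """Aggregate the bootstrap replicates by majority vote on each point."""
    return [main_prediction([p[i] for p in preds]) for i in range(len(preds[0]))]


# Toy predictions from three bootstrap replicates per (hypothetical) configuration.
y_true = [1, 1, 0, 0]
preds_by_config = {
    "linear_C=1": [[1, 1, 0, 1], [1, 0, 0, 1], [1, 1, 0, 1]],
    "rbf_C=10": [[1, 1, 0, 0], [0, 1, 0, 0], [1, 0, 1, 0]],
}
best = select_lowest_bias(preds_by_config, y_true)
ensemble_labels = bagged_predict(preds_by_config[best])
```

In this toy data the second configuration has zero bias but higher variance, which is the situation where aggregation is expected to help, since majority voting targets the variance component. In practice the bias would have to be estimated on data not used for training each replicate.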

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Valentini, G. (2003). An Application of Low Bias Bagged SVMs to the Classification of Heterogeneous Malignant Tissues. In: Apolloni, B., Marinaro, M., Tagliaferri, R. (eds) Neural Nets. WIRN 2003. Lecture Notes in Computer Science, vol 2859. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45216-4_36

  • DOI: https://doi.org/10.1007/978-3-540-45216-4_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20227-1

  • Online ISBN: 978-3-540-45216-4

  • eBook Packages: Springer Book Archive
