Boosting Threshold Classifiers for High-Dimensional Data in Functional Genomics

Lausser, Ludwig; Buchholz, Malte; Kestler, Hans A.

doi:10.1007/978-3-540-69939-2_15

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5064)

Included in the following conference series:

IAPR Workshop on Artificial Neural Networks in Pattern Recognition

Abstract

Diagnosis of disease based on the classification of DNA microarray gene expression profiles of clinical samples is a promising novel approach to improving the performance and accuracy of current routine diagnostic procedures. In many applications, ensembles outperform single classifiers. In a clinical setting, a combination of simple classification rules, such as single threshold classifiers on individual gene expression values, may provide valuable insights and facilitate the diagnostic process. A boosting algorithm can build such decision rules by using single threshold classifiers as base classifiers. AdaBoost can be seen as the predecessor of many later boosting algorithms; unfortunately, its performance degrades on high-dimensional data. Here we compare three extensions of AdaBoost, namely MultiBoost, MadaBoost and AdaBoost-VC, in cross-validation experiments on noisy high-dimensional artificial and real data sets. The artificial data sets are constructed so that the features relevant for the class distinction can easily be read out. Our special interest is in the features the ensembles select for classification and in how many of them are actually related to the original class distinction.
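To make the setting in the abstract concrete, the following is a minimal sketch of plain discrete AdaBoost with single threshold classifiers on individual features, run on artificial data in which only a known subset of features carries class information. It is an illustration under stated assumptions, not the authors' implementation: it does not implement MultiBoost, MadaBoost or AdaBoost-VC, and the function names (`make_data`, `best_stump`, `adaboost`, `predict`), the Gaussian data model, and all parameter values are hypothetical choices made for this example.

```python
import math
import random


def make_data(n=40, d=30, relevant=(0, 1), shift=3.0, seed=0):
    """Artificial two-class data: only the `relevant` features carry signal.
    (Assumed data model for illustration, not the one used in the paper.)"""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        row = [rng.gauss(0.0, 1.0) for _ in range(d)]  # pure noise
        for j in relevant:
            row[j] += shift * label  # class-dependent mean shift
        X.append(row)
        y.append(label)
    return X, y


def stump_output(row, feature, threshold, polarity):
    """Single threshold classifier on one feature, returning +1 or -1."""
    return polarity if row[feature] > threshold else -polarity


def best_stump(X, y, w):
    """Exhaustively search for the stump with the lowest weighted error."""
    best = (float("inf"), 0, 0.0, 1)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between neighbouring values.
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            err = sum(wi for row, yi, wi in zip(X, y, w)
                      if stump_output(row, j, t, 1) != yi)
            # Flipping the polarity turns an error of err into 1 - err.
            for polarity, e in ((1, err), (-1, max(0.0, 1.0 - err))):
                if e < best[0]:
                    best = (e, j, t, polarity)
    return best


def adaboost(X, y, rounds=10):
    """Discrete AdaBoost; returns a list of (alpha, feature, threshold, polarity)."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, j, t, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, j, t, pol))
        # Increase the weight of misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_output(row, j, t, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Weighted majority vote of all stumps in the ensemble."""
    score = sum(a * stump_output(row, j, t, pol) for a, j, t, pol in ensemble)
    return 1 if score >= 0 else -1


relevant = (0, 1)
X, y = make_data(relevant=relevant)
ensemble = adaboost(X, y, rounds=10)
selected = sorted({j for _, j, _, _ in ensemble})
hits = [j for j in selected if j in relevant]
accuracy = sum(predict(ensemble, r) == yi for r, yi in zip(X, y)) / len(y)
print("selected features:", selected)
print("relevant among selected:", len(hits), "of", len(selected))
print("training accuracy:", accuracy)
```

Because the relevant feature indices are known by construction, the selected features can be compared directly against them, which is the kind of read-out the abstract describes. The accuracy printed here is measured on the training data only; the cross-validation estimate used in the paper would require repeating the fit on held-out folds.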

Author information

Authors and Affiliations

Department of Internal Medicine I, University Hospital Ulm, Germany
Ludwig Lausser & Hans A. Kestler
Institute of Neural Information Processing, University of Ulm, Germany
Hans A. Kestler
Internal Medicine, SP Gastroenterology, University Hospital Marburg, Germany
Malte Buchholz

Editor information

Lionel Prevost, Simone Marinai, Friedhelm Schwenker

Cite this paper

Lausser, L., Buchholz, M., Kestler, H.A. (2008). Boosting Threshold Classifiers for High-Dimensional Data in Functional Genomics. In: Prevost, L., Marinai, S., Schwenker, F. (eds) Artificial Neural Networks in Pattern Recognition. ANNPR 2008. Lecture Notes in Computer Science, vol 5064. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69939-2_15

DOI: https://doi.org/10.1007/978-3-540-69939-2_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69938-5
Online ISBN: 978-3-540-69939-2
eBook Packages: Computer Science, Computer Science (R0)
