A low variance error boosting algorithm

Wang, Ching-Wei; Hunter, Andrew

doi:10.1007/s10489-009-0172-0

Published: 21 February 2009

Applied Intelligence, volume 33, pages 357–369 (2010)

Abstract

This paper introduces a robust variant of AdaBoost, cw-AdaBoost, that uses weight perturbation to reduce variance error. It is particularly effective on datasets, such as microarray data, that have a large number of features and a small number of instances. The algorithm is compared with AdaBoost, Arcing and MultiBoost on twelve gene expression datasets using 10-fold cross-validation. The new algorithm consistently achieves higher classification accuracy across all of these datasets. In contrast to other AdaBoost variants, the algorithm is not susceptible to problems when a zero-error base classifier is encountered.
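
The zero-error problem mentioned in the abstract can be illustrated with the vote-weight formula of standard discrete AdaBoost, alpha = 0.5 * ln((1 - e) / e), which is unbounded when the weighted error e is 0. The following is a minimal sketch of standard AdaBoost only, not of cw-AdaBoost, whose weight perturbation scheme is not described in the abstract. The function names are illustrative assumptions.

```python
import math


def weighted_error(weights, predictions, labels):
    """Weighted training error of one base classifier (weights assumed to sum to 1)."""
    return sum(w for w, p, y in zip(weights, predictions, labels) if p != y)


def adaboost_alpha(error):
    """Vote weight of a base classifier in standard discrete AdaBoost.

    Assumes 0 <= error < 1.
    """
    if error == 0.0:
        # ln((1 - 0) / 0) is unbounded: a zero-error classifier would
        # receive an infinite vote and dominate the ensemble.
        return math.inf
    return 0.5 * math.log((1.0 - error) / error)


def reweight(weights, predictions, labels, alpha):
    """Standard AdaBoost update: raise the weights of misclassified
    instances, lower the rest, then normalise to sum to 1."""
    updated = [
        w * math.exp(alpha if p != y else -alpha)
        for w, p, y in zip(weights, predictions, labels)
    ]
    total = sum(updated)
    return [w / total for w in updated]


if __name__ == "__main__":
    weights = [0.25] * 4
    labels = [1, 1, -1, -1]
    predictions = [1, 1, -1, 1]  # one mistake, on the last instance
    err = weighted_error(weights, predictions, labels)
    alpha = adaboost_alpha(err)
    print(err, alpha, reweight(weights, predictions, labels, alpha))
    print(adaboost_alpha(0.0))  # infinite vote weight for a zero-error classifier
```

With a finite alpha, the update moves half of the total weight onto the misclassified instances. With a zero-error classifier, alpha is infinite and every updated weight becomes zero, so the normalisation step in this sketch would fail. That is the kind of degenerate case the abstract refers to.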

Author information

Authors and Affiliations

University of Lincoln, Lincoln, LN6 7TS, UK
Ching-Wei Wang & Andrew Hunter

Corresponding author

Correspondence to Ching-Wei Wang.

About this article

Cite this article

Wang, CW., Hunter, A. A low variance error boosting algorithm. Appl Intell 33, 357–369 (2010). https://doi.org/10.1007/s10489-009-0172-0

Received: 19 June 2008
Accepted: 10 February 2009
Published: 21 February 2009
Issue Date: December 2010
DOI: https://doi.org/10.1007/s10489-009-0172-0
