Machine Learning, Volume 61, Issue 1, pp 71–103

The Synergy Between PAV and AdaBoost

DOI: 10.1007/s10994-005-1123-6

Cite this article as:
Wilbur, W.J., Yeganova, L. & Kim, W. Mach Learn (2005) 61: 71. doi:10.1007/s10994-005-1123-6

Abstract

Schapire and Singer's improved version of AdaBoost for handling weak hypotheses with confidence-rated predictions represents an important advance in the theory and practice of boosting. Its success results from a more efficient use of the information in weak hypotheses during updating: instead of simple binary voting, a weak hypothesis is allowed to vote for or against a classification with variable strength or confidence. The Pool Adjacent Violators (PAV) algorithm is a method for converting a score into a probability. We show how PAV may be applied to a weak hypothesis to yield a new weak hypothesis that is, in a sense, an ideal confidence-rated prediction, and that this leads to an optimal update for AdaBoost. The result is a new algorithm, which we term PAV-AdaBoost. We give several examples illustrating problems for which this new algorithm provides performance advantages.
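
The abstract only names the PAV algorithm; as a rough illustration of the idea it refers to (isotonic regression of binary labels on classifier scores), here is a minimal Python sketch. It is not the paper's implementation, and the function names pav and prob_for_score, along with the toy data, are illustrative assumptions.

def pav(scores, labels):
    """Isotonic regression via Pool Adjacent Violators (PAV).

    Sorts (score, label) pairs by score, then repeatedly pools adjacent
    blocks whose means violate monotonicity. The result is a
    non-decreasing step function mapping score intervals to empirical
    probabilities. (Illustrative sketch, not the paper's code.)
    """
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block: [label_sum, count, rightmost_score]
    for score, label in pairs:
        blocks.append([float(label), 1, score])
        # Pool while the previous block's mean exceeds the current one's
        # (comparison is cross-multiplied to avoid division).
        while len(blocks) > 1 and (
            blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]
        ):
            s, n, r = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
            blocks[-1][2] = r
    # Return (upper score bound of block, pooled probability) pairs.
    return [(r, s / n) for s, n, r in blocks]

def prob_for_score(blocks, score):
    """Evaluate the fitted PAV step function at a new score."""
    for upper, p in blocks:
        if score <= upper:
            return p
    return blocks[-1][1]

# Toy usage (hypothetical data):
scores = [0.1, 0.3, 0.35, 0.5, 0.8, 0.9]
labels = [0, 1, 0, 1, 1, 1]
blocks = pav(scores, labels)
# blocks == [(0.1, 0.0), (0.35, 0.5), (0.5, 1.0), (0.8, 1.0), (0.9, 1.0)]

In the setting the abstract describes, pooled block probabilities of this kind would play the role of the confidence-rated predictions fed to AdaBoost's update; the details of that construction are given in the full text.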

Keywords

boosting, isotonic regression, convergence, document classification, k nearest neighbors

Copyright information

© Springer Science + Business Media, Inc. 2005

Authors and Affiliations

  1. National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, U.S.A.