Boosting Based on a Smooth Margin
We study two boosting algorithms, Coordinate Ascent Boosting and Approximate Coordinate Ascent Boosting, which are explicitly designed to produce maximum margins. To derive these algorithms, we introduce a smooth approximation of the margin that one can maximize in order to produce a maximum margin classifier. Our first algorithm is simply coordinate ascent on this function, involving a line search at each step. We then make a simple approximation of this line search to reveal our second algorithm. These algorithms are proven to asymptotically achieve maximum margins, and we provide two convergence rate calculations. The second calculation yields a faster rate of convergence than the first, although the first gives a more explicit (still fast) rate. These algorithms are very similar to AdaBoost in that they are based on coordinate ascent, easy to implement, and empirically tend to converge faster than other boosting algorithms. Finally, we attempt to understand AdaBoost in terms of our smooth margin, focusing on cases where AdaBoost exhibits cyclic behavior.
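The first algorithm described above (coordinate ascent on a smooth margin, with a line search at each step) can be illustrated with a minimal sketch. The abstract does not state the smooth margin formula, so the sketch assumes it has the form G(λ) = −ln(Σ_i exp(−(Mλ)_i)) / ‖λ‖₁, where M_ij = y_i h_j(x_i) is +1 when weak classifier j is correct on example i and −1 otherwise. The AdaBoost-style warm-up used while G is not yet positive, the ternary line search, the search bound `hi`, the toy matrix `M_TOY`, and all function names are likewise assumptions made for illustration, not the authors' implementation.

```python
import math

# Toy margin matrix (an assumption for illustration): 3 examples, 3 weak
# classifiers, each classifier wrong on exactly one example.
M_TOY = [[-1, 1, 1],
         [1, -1, 1],
         [1, 1, -1]]

def scores_of(M, lam):
    """Return the vector (M lam)_i of unnormalized margins."""
    return [sum(m * l for m, l in zip(row, lam)) for row in M]

def smooth_margin(M, lam):
    """Assumed smooth margin: -ln(sum_i exp(-(M lam)_i)) / ||lam||_1, lam >= 0."""
    s = scores_of(M, lam)
    top = max(-x for x in s)  # log-sum-exp shift for numerical stability
    log_f = top + math.log(sum(math.exp(-x - top) for x in s))
    return -log_f / sum(lam)

def true_margin(M, lam):
    """Minimum normalized margin over the examples."""
    return min(scores_of(M, lam)) / sum(lam)

def line_search(g, hi=10.0, iters=80):
    """Ternary search for the maximizer of a unimodal g on [0, hi]."""
    lo = 0.0
    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if g(a) < g(b):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2

def coordinate_ascent_boosting(M, rounds=200):
    """Sketch: coordinate ascent on the assumed smooth margin."""
    n = len(M[0])
    lam = [0.0] * n
    for _ in range(rounds):
        # Example weights d_i proportional to exp(-(M lam)_i).
        s = scores_of(M, lam)
        top = max(-x for x in s)
        w = [math.exp(-x - top) for x in s]
        total = sum(w)
        d = [x / total for x in w]
        # Pick the weak classifier with the largest edge sum_i d_i M_ij.
        edges = [sum(d[i] * M[i][j] for i in range(len(M))) for j in range(n)]
        j = max(range(n), key=lambda k: edges[k])
        if sum(lam) == 0 or smooth_margin(M, lam) <= 0:
            # Warm-up (assumption): AdaBoost step while G is not positive.
            # Assumes 0 < edge < 1, which holds for separable +/-1 data.
            r = min(edges[j], 1 - 1e-12)
            alpha = 0.5 * math.log((1 + r) / (1 - r))
        else:
            def g(a, j=j):
                trial = list(lam)
                trial[j] += a
                return smooth_margin(M, trial)
            alpha = line_search(g)
        lam[j] += alpha
    return lam

if __name__ == "__main__":
    lam = coordinate_ascent_boosting(M_TOY)
    print(smooth_margin(M_TOY, lam), true_margin(M_TOY, lam))
```

Under the assumed formula, the smooth margin never exceeds the true minimum margin, and on this toy matrix the true margin is bounded above by 1/3, so the two printed values give a quick sanity check. The ternary search is valid here because, along a single coordinate, the assumed G is a concave function divided by a positive affine one and therefore has a single peak.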
Keywords: Convergence Rate, Line Search, Cyclic Behavior, Neural Information Processing System, Maximum Margin