Boosting Inspired Process for Improving AUC

  • Victor S. Sheng
  • Rahul Tada
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6871)


Boosting is a general method for combining a set of classifiers to make a final prediction. It has been shown to be an effective way to improve the predictive accuracy of a learning algorithm, but its impact on ranking performance is unknown. This paper introduces AUCBoost, a generic boosting algorithm for improving the ranking performance of learning algorithms. Unlike AdaBoost, AUCBoost uses the AUC of a classifier, rather than its accuracy, to calculate the weight of each training example for building the next classifier. To simplify the computation of the AUC of weighted instances in AUCBoost, we extend the standard formula for calculating AUC to a weighted AUC formula (WAUC for short). This extension frees boosting from the resampling process and saves considerable computation time during training. Our experimental results show that AUCBoost improves on the ranking performance of AdaBoost when the base learning algorithm is either C4.4, a decision tree improved for ranking, or naïve Bayes.
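The abstract does not reproduce the WAUC formula, so the following is only an illustrative sketch of one natural way to weight the pairwise (Mann-Whitney) form of AUC, not necessarily the authors' definition. The function name `weighted_auc` and the choice to credit each positive-negative pair with the product of the two instance weights are assumptions made for this example. It shows how an AUC over weighted instances can be computed directly, without resampling the training set.

```python
from typing import Sequence


def weighted_auc(
    scores: Sequence[float],
    labels: Sequence[int],
    weights: Sequence[float],
) -> float:
    """Pairwise AUC in which each (positive, negative) pair counts
    with weight w_pos * w_neg (an assumption, not the paper's formula).

    labels use 1 for the positive class and 0 for the negative class.
    """
    pos = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 1]
    neg = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")

    # Total pair weight: (sum of positive weights) * (sum of negative weights).
    total = sum(w for _, w in pos) * sum(w for _, w in neg)
    if total <= 0:
        raise ValueError("weights of each class must sum to a positive value")

    credit = 0.0
    for s_pos, w_pos in pos:
        for s_neg, w_neg in neg:
            if s_pos > s_neg:
                credit += w_pos * w_neg        # correctly ranked pair
            elif s_pos == s_neg:
                credit += 0.5 * w_pos * w_neg  # tie gets half credit
    return credit / total
```

With all weights equal to 1, this reduces to the standard pairwise AUC. The double loop is quadratic in the number of instances; a sort-based version would be faster but is omitted to keep the sketch short.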


boosting, AUCBoost, AUC, classification, inductive learning, decision tree, naïve Bayes, data mining, machine learning





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Victor S. Sheng (1)
  • Rahul Tada (1)
  1. Department of Computer Science, University of Central Arkansas, Conway, USA
