Parameter Inference of Cost-Sensitive Boosting Algorithms

  • Yanmin Sun
  • A. K. C. Wong
  • Yang Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3587)

Abstract

Several cost-sensitive boosting algorithms have been reported as effective methods for dealing with the class imbalance problem. Misclassification costs, which reflect the differing importance of identifying each class, are integrated into the weight update formula of the AdaBoost algorithm. Yet it has been shown that the weight update parameter of AdaBoost is induced so that the training error is reduced most rapidly. This is the most crucial step of AdaBoost in converting a weak learning algorithm into a strong one. However, most reported cost-sensitive boosting algorithms ignore this property. In this paper, we propose three versions of cost-sensitive AdaBoost algorithms in which the parameters for sample weight updating are induced. Their ability to identify the small class is then evaluated, using the F-measure, on four “real world” medical data sets taken from the UCI Machine Learning Database. Our experimental results show that one of the proposed cost-sensitive AdaBoost algorithms achieves the best identification ability on the small class among all reported cost-sensitive boosting algorithms.
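As a rough illustration of the two ingredients the abstract refers to, the sketch below shows the weight update parameter of standard (cost-insensitive) discrete AdaBoost and the F-measure on a chosen class. It is not the cost-sensitive variants proposed in the paper, whose formulas are not given here; the function names and the toy data are assumptions made for illustration only.

```python
import math

def adaboost_alpha(weights, correct):
    """Standard AdaBoost parameter: alpha = 0.5 * ln((1 - eps) / eps),
    where eps is the weighted error of the current weak classifier."""
    eps = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0.0 < eps < 1.0:
        raise ValueError("weighted error must lie strictly between 0 and 1")
    return 0.5 * math.log((1.0 - eps) / eps)

def normalizer(weights, correct, alpha):
    """Z(alpha): sum of w_i * exp(-alpha) for correctly classified samples
    and w_i * exp(+alpha) for misclassified ones. The product of Z over
    rounds upper-bounds the training error, so alpha is chosen to minimize it."""
    return sum(w * math.exp(-alpha if ok else alpha)
               for w, ok in zip(weights, correct))

def update_weights(weights, correct, alpha):
    """Reweight samples and renormalize so the weights sum to 1."""
    z = normalizer(weights, correct, alpha)
    return [w * math.exp(-alpha if ok else alpha) / z
            for w, ok in zip(weights, correct)]

def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall on the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy round (hypothetical data): four equally weighted samples, one misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]
alpha = adaboost_alpha(weights, correct)
new_weights = update_weights(weights, correct, alpha)
```

In this toy round the misclassified sample ends up carrying half of the total weight, which is the usual effect of the standard update. A cost-sensitive variant would additionally bring per-sample costs into the update, and the abstract's point is that the parameter should then be re-derived rather than reused unchanged.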

Keywords

  • Training Error
  • Recognition Ability
  • Positive Class
  • Cost Item
  • AdaBoost Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Yanmin Sun (1)
  • A. K. C. Wong (1)
  • Yang Wang (2)

  1. Pattern Analysis and Machine Intelligence Lab, University of Waterloo
  2. Pattern Discovery Software Ltd