Parameter Inference of Cost-Sensitive Boosting Algorithms
Several cost-sensitive boosting algorithms have been reported as effective methods for dealing with the class imbalance problem. In these algorithms, misclassification costs, which reflect the differing importance of identifying each class, are integrated into the weight update formula of the AdaBoost algorithm. However, the weight update parameter of AdaBoost is derived so that the training error is reduced as rapidly as possible, and this is the most crucial step by which AdaBoost converts a weak learning algorithm into a strong one. Most reported cost-sensitive boosting algorithms ignore this property. In this paper, we propose three versions of cost-sensitive AdaBoost in which the parameters for sample weight updating are derived. Their ability to identify the small class is then tested, using the F-measure, on four “real world” medical data sets taken from the UCI Machine Learning Repository. Our experimental results show that one of the proposed algorithms achieves the best identification ability on the small class among all reported cost-sensitive boosting algorithms.
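The idea of deriving the weight update parameter, rather than reusing the standard AdaBoost value after costs are introduced, can be illustrated with a minimal sketch. It assumes labels in {-1, +1} and one particular way of inserting costs (each sample weight is multiplied by its cost outside the exponent); this is an illustrative choice, not necessarily one of the three versions proposed in the paper. Under that assumption the normalizer is Z(alpha) = sum_i c_i w_i exp(-alpha y_i h_i), and setting its derivative to zero gives alpha = 0.5 ln(correct / wrong), where the two sums are the cost-weighted masses of correctly and incorrectly classified samples. The function names `induced_alpha`, `update_weights`, and `f_measure` are hypothetical.

```python
import math


def induced_alpha(weights, costs, y_true, y_pred):
    """Alpha minimizing Z(alpha) = sum_i c_i * w_i * exp(-alpha * y_i * h_i).

    Assumption: costs multiply the weights outside the exponent.
    With all costs equal, this reduces to the standard AdaBoost alpha.
    """
    correct = sum(c * w for w, c, y, h in zip(weights, costs, y_true, y_pred) if y == h)
    wrong = sum(c * w for w, c, y, h in zip(weights, costs, y_true, y_pred) if y != h)
    if correct == 0 or wrong == 0:
        raise ValueError("alpha is undefined when one of the two masses is zero")
    return 0.5 * math.log(correct / wrong)


def update_weights(weights, costs, y_true, y_pred, alpha):
    """Apply the cost-weighted update and renormalize to sum to 1."""
    raw = [c * w * math.exp(-alpha * y * h)
           for w, c, y, h in zip(weights, costs, y_true, y_pred)]
    z = sum(raw)
    return [r / z for r in raw]


def f_measure(y_true, y_pred, positive=1):
    """F-measure (harmonic mean of precision and recall) for the positive class."""
    tp = sum(1 for y, h in zip(y_true, y_pred) if y == positive and h == positive)
    fp = sum(1 for y, h in zip(y_true, y_pred) if y != positive and h == positive)
    fn = sum(1 for y, h in zip(y_true, y_pred) if y == positive and h != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (made up for illustration): the positive class is the small class.
y_true = [1, 1, -1, -1, -1, -1]
y_pred = [1, -1, -1, -1, -1, 1]   # output of one hypothetical weak learner
weights = [1 / 6] * 6             # uniform initial sample weights
costs = [2.0 if y == 1 else 1.0 for y in y_true]  # assumed cost items

alpha = induced_alpha(weights, costs, y_true, y_pred)
new_weights = update_weights(weights, costs, y_true, y_pred, alpha)
```

With the derived alpha, the misclassified samples carry exactly half of the updated weight mass, which is the same property the standard AdaBoost parameter has in the cost-free case. Reusing the standard alpha while changing the update formula would not, in general, preserve it.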
Keywords: Training Error; Recognition Ability; Positive Class; Cost Item; AdaBoost Algorithm