Abstract
In this chapter, we consider the imbalanced learning problem: binary classification on imbalanced data, in which nearly all instances carry one label while far fewer instances carry the other, usually more important, label. Traditional machine learning methods, which seek accurate performance over the full range of instances, are ill-suited to this problem, since they tend to classify all data into the majority class, usually the less important one. Many current methods instead manipulate intermediate factors, e.g. the distribution of the training set, the decision thresholds, or the cost matrix, to impose a bias towards the important class. However, it remains unclear whether these indirect methods improve performance in a systematic way. In this chapter, we apply the Biased Minimax Probability Machine (BMPM), a special case of the Minimum Error Minimax Probability Machine, to imbalanced learning tasks. Unlike previous methods, this model derives the biased classifier in a worst-case setting by directly controlling the classification accuracy on each class. More precisely, BMPM builds an explicit connection between the classification accuracy and the bias, and thus provides a rigorous treatment of imbalanced data. We examine different models and compare BMPM with three other competitive methods, i.e. the Naive Bayesian classifier, the k-Nearest Neighbor method, and the decision tree method C4.5. The experimental results demonstrate the superiority of this model.
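The worst-case accuracy control described above can be illustrated with the distribution-free bound used by minimax probability machines (Lanckriet et al. 2002): over all distributions sharing a given mean and covariance, the worst-case probability that a point falls on the correct side of a hyperplane is α = κ²/(1+κ²), where κ is the signed distance from the class mean to the hyperplane in Mahalanobis units. The sketch below is not the authors' implementation; the one-dimensional grid search and all function names are illustrative assumptions. It picks a decision threshold maximizing the worst-case accuracy α on the important class subject to a floor β₀ on the other class, which is the biasing mechanism BMPM formalizes.

```python
import numpy as np

def worst_case_accuracy(w, b, mu, cov):
    """Worst-case probability of w.z >= b over all distributions with
    mean mu and covariance cov: alpha = kappa^2 / (1 + kappa^2),
    where kappa = (w.mu - b) / sqrt(w' cov w)."""
    kappa = (w @ mu - b) / np.sqrt(w @ cov @ w)
    if kappa <= 0:
        return 0.0
    return kappa ** 2 / (1.0 + kappa ** 2)

def bmpm_threshold_1d(mu_pos, var_pos, mu_neg, var_neg, beta0):
    """Illustrative 1-D BMPM-style search: choose threshold b maximizing
    the worst-case accuracy alpha on the important (positive) class,
    subject to worst-case accuracy beta >= beta0 on the negative class.
    The positive class is assumed to lie above the threshold."""
    grid = np.linspace(mu_neg - 5 * np.sqrt(var_neg),
                       mu_pos + 5 * np.sqrt(var_pos), 2001)
    best_b, best_alpha = None, -1.0
    for b in grid:
        # The negative class is classified correctly when -z >= -b.
        beta = worst_case_accuracy(np.array([-1.0]), -b,
                                   np.array([mu_neg]),
                                   np.array([[var_neg]]))
        if beta < beta0:
            continue  # bias constraint on the less important class violated
        alpha = worst_case_accuracy(np.array([1.0]), b,
                                    np.array([mu_pos]),
                                    np.array([[var_pos]]))
        if alpha > best_alpha:
            best_b, best_alpha = b, alpha
    return best_b, best_alpha
```

For example, with class means 2 and 0 and unit variances, requiring β₀ = 0.5 pushes the threshold up to b ≈ 1, the smallest value satisfying the constraint, and yields a worst-case α ≈ 0.5 on the important class; raising β₀ trades α away explicitly, which is the direct accuracy-bias connection the chapter exploits.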
References
Aha D, Kibler D, Albert M (1991) Instance-based learning algorithms. Machine Learning 6: 37–66
Bradley A (1997) The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition 30(7): 1145–1159
Cardie C, Howe N (1997) Improving minority class prediction using case specific feature weights. In Proceedings of the Fourteenth International Conference on Machine Learning (ICML-1997). San Francisco, CA: Morgan Kaufmann 57–65
Chawla N, Bowyer K, Hall L, Kegelmeyer W (2002) Smote: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research 16: 321–357
Dorfman K, Berbaum D, Metz C (1992) Receiver operating characteristic rating analysis: generalization to the population of readers and patients with the jackknife method. Investigative Radiology 27: 723–731
Dori D, Liu W (1999) Sparse pixel vectorization: An algorithm and its performance evaluation. IEEE Trans. Pattern Analysis and Machine Intelligence 21: 202–215
Firschein O, Strat T (1996) RADIUS: Image understanding for imagery intelligence. San Francisco, CA: Morgan Kaufmann
Friedman N, Geiger D, Goldszmidt M (1997) Bayesian network classifiers. Machine Learning 29: 131–161
Grzymala-Busse JW, Goodwin LK, Zhang X (2003) Increasing sensitivity of preterm birth by changing rule strengths. Pattern Recognition Letters 24: 903–910
Huang K, King I, Lyu MR (2003) Discriminative training of Bayesian Chow-Liu tree multinet classifiers. In Proceedings of International Joint Conference on Neural Networks (IJCNN-2003), Portland, Oregon, U.S.A. 1: 484–488
Huang K, King I, Lyu MR (2003) Finite mixture model of bound semi-naive Bayesian network classifier. In Proceedings of the International Conference on Artificial Neural Networks (ICANN-2003), Lecture Notes in Artificial Intelligence, Long paper. Heidelberg: Springer-Verlag 2714: 115–122
Jaakkola TS, Haussler D (1998) Exploiting generative models in discriminative classifiers. In Advances in Neural Information Processing Systems (NIPS)
Kohavi R (1995) A study of cross validation and bootstrap for accuracy estimation and model selection. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI-1995). San Francisco, CA: Morgan Kaufmann 338–345
Kubat M, Holte R, Matwin S (1998) Machine learning for the detection of oil spills in satellite radar images. Machine Learning 30(2–3): 195–215
Kubat M, Matwin S (1997) Addressing the curse of imbalanced training sets: One-sided selection. In Proceedings of the Fourteenth International Conference on Machine Learning (ICML-1997). San Francisco, CA: Morgan Kaufmann 179–186
Lanckriet GRG, Ghaoui LE, Bhattacharyya C, Jordan MI (2001) Minimax probability machine. In Advances in Neural Information Processing Systems (NIPS)
Lanckriet GRG, Ghaoui LE, Bhattacharyya C, Jordan MI (2002) A robust minimax approach to classification. Journal of Machine Learning Research 3: 555–582
Langley P, Iba W, Thompson K (1992) An analysis of Bayesian classifiers. In Proceedings of National Conference on Artificial Intelligence 223–228
Lerner B, Lawrence ND (2001) A comparison of state-of-the-art classification techniques with application to cytogenetics. Neural Computing and Applications 10(1): 39–47
Lewis D, Catlett J (1994) Heterogeneous uncertainty sampling for supervised learning. In Proceedings of the Eleventh International Conference on Machine Learning (ICML-1994). San Francisco, CA: Morgan Kaufmann 148–156
Lin C, Nevatia R (1998) Building detection and description from a single intensity image. Computer Vision and Image Understanding 72: 101–121
Ling C, Li C (1998) Data mining for direct marketing: problems and solutions. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-1998). Menlo Park, CA: AAAI Press 73–79
Liu W, Dori D (1997) A protocol for performance evaluation of line detection algorithms. Machine Vision and Application 9: 240–250
Maloof MA (2002) On machine learning, ROC analysis, and statistical tests of significance. In Proceedings of the Sixteenth International Conference on Pattern Recognition. Los Alamitos, CA: IEEE Press 204–207
Maloof MA (2003) Learning when data sets are imbalanced and when costs are unequal and unknown. In Proceedings of International Conference on Machine Learning (ICML-2003)
Maloof MA, Langley P, Binford TO, Nevatia R, Sage S (2003) Improved rooftop detection in aerial images with machine learning. Machine Learning 53: 157–191
Mcclish D (1989) Analyzing a portion of the ROC curve. Medical Decision Making 9(3): 190–195
Nugroho AS, Kuroyanagi S, Iwata A (2002) A solution for imbalanced training sets problem by CombNET and its application on fog forecasting. IEICE Transactions on Information and Systems E85-D(7)
Provost F (2000) Learning from imbalanced data sets. In Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI-2000)
Provost F, Fawcett T (1997) Analysis and visualization of classifier performance: comparison under imprecise class and cost distributions. In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining. Menlo Park, CA: AAAI Press 43–48
Quinlan JR (1993) C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann Publishers
Schmidt P, Witte A (1988) Predicting Recidivism Using Survival Models. New York, NY: Springer-Verlag
Swets J (1988) Measuring the accuracy of diagnostic systems. Science 240: 1285–1293
Swets J, Pickett R (1982) Evaluation of Diagnostic Systems: Methods from Signal Detection Theory. New York, NY: Springer-Verlag
Vapnik VN (1999) The Nature of Statistical Learning Theory. New York, NY: Springer-Verlag, 2nd edition
Woods K, Kegelmeyer Jr WP, Bowyer K (1997) Combination of multiple classifiers using local accuracy estimates. IEEE Transactions on Pattern Analysis and Machine Intelligence 19(4): 405–410
Copyright information
© 2008 Zhejiang University Press, Hangzhou and Springer-Verlag GmbH Berlin Heidelberg
Cite this chapter
(2008). Extension I: BMPM for Imbalanced Learning. In: Machine Learning. Advanced Topics in Science and Technology in China. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79452-3_5
DOI: https://doi.org/10.1007/978-3-540-79452-3_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-79451-6
Online ISBN: 978-3-540-79452-3
eBook Packages: Computer Science (R0)