Abstract
Associative classification is a rule-based classification approach that has been applied in many real-world applications. An associative classifier is easily interpretable because its model is a set of classification rules. However, there is room for improvement when associative classification is applied to imbalanced classification tasks. Existing associative classification algorithms can perform poorly on highly imbalanced datasets in which the class of interest is the minority class. Our objective is to improve the accuracy of the associative classifier on highly imbalanced datasets. In this paper, an effective cost-sensitive rule ranking method, named SSCR (Statistically Significant Cost-sensitive Rules), is proposed to estimate the risk of a rule in classifying unseen data. The risk of a statistically significant association rule is estimated based on its classification cost induced from the training data. SSCR combines statistically significant association rules with cost-sensitive learning to build an associative classifier. Experimental results show that SSCR achieves the best performance in terms of true positive rate and recall on real-world imbalanced datasets, compared with CBA and C4.5.
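The abstract does not give the SSCR risk formula, so the following is only a minimal sketch of the general idea of ranking class association rules by a cost estimated from the training data. Everything in it is an assumption made for illustration: the `Rule` type, the `rule_cost`, `rank_rules`, and `classify` helpers, the toy records, and the cost values (a missed minority "pos" case is assumed to cost ten times a false alarm). The statistical significance filter is assumed to have been applied before ranking and is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical class association rule: antecedent items -> class label."""
    antecedent: frozenset
    label: str


def rule_cost(rule, records, cost):
    """Average misclassification cost of a rule over the training records it covers.

    `cost` maps (true_label, predicted_label) to a non-negative cost.
    A rule that covers no record gets infinite cost so it ranks last.
    """
    covered = [y for items, y in records if rule.antecedent <= items]
    if not covered:
        return float("inf")
    return sum(cost[(y, rule.label)] for y in covered) / len(covered)


def rank_rules(rules, records, cost):
    """Order rules from lowest to highest estimated cost (illustrative ranking)."""
    return sorted(rules, key=lambda r: rule_cost(r, records, cost))


def classify(ranked_rules, items, default):
    """Predict with the first (lowest-cost) rule whose antecedent matches."""
    for rule in ranked_rules:
        if rule.antecedent <= items:
            return rule.label
    return default


# Toy imbalanced training set: (items, true class); "pos" is the minority class.
records = [
    (frozenset({"a", "b"}), "pos"),
    (frozenset({"a"}), "neg"),
    (frozenset({"a", "c"}), "neg"),
    (frozenset({"b"}), "pos"),
    (frozenset({"c"}), "neg"),
    (frozenset({"c"}), "neg"),
]

# Assumed cost matrix, keyed by (true, predicted).
cost = {
    ("pos", "pos"): 0, ("neg", "neg"): 0,
    ("pos", "neg"): 10,  # missing a minority case is expensive
    ("neg", "pos"): 1,   # a false alarm is cheap
}

rules = [
    Rule(frozenset({"a"}), "neg"),
    Rule(frozenset({"b"}), "pos"),
    Rule(frozenset({"a"}), "pos"),
]

ranked = rank_rules(rules, records, cost)
for r in ranked:
    print(sorted(r.antecedent), "->", r.label, round(rule_cost(r, records, cost), 3))
```

With these assumed costs, the rule `{a} -> pos` ranks ahead of `{a} -> neg` even though most records containing `a` are negative, because each missed positive is penalized more heavily than a false alarm. A confidence-based ranking would order the two rules the other way round.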
References
Weiss, G.M.: Mining with Rarity: A Unifying Framework. SIGKDD Explorations 6(1), 7–19 (2004)
Liu, B., Hsu, W., Ma, Y.: Integrating Classification and Association Rule Mining. In: KDD, pp. 80–86 (1998)
Li, W., Han, J., Pei, J.: CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules. In: ICDM, pp. 369–376 (2001)
Yin, X., Han, J.: CPAR: Classification based on Predictive Association Rules. In: SDM (2003)
Verhein, F., Chawla, S.: Using Significant, Positively Associated and Relatively Class Correlated Rules for Associative Classification of Imbalanced Datasets. In: ICDM, pp. 679–684 (2007)
Agrawal, R., Srikant, R.: Fast Algorithms for Mining Association Rules in Large Databases. In: VLDB, pp. 487–499 (1994)
Webb, G.I.: Discovering significant rules. In: KDD, pp. 434–443 (2006)
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: an update. SIGKDD Explorations, 10–18 (2009)
Quinlan, J.R.: C4.5: Programs for Machine Learning
Tan, P., Steinbach, M., Kumar, V.: Introduction to Data Mining
Asuncion, A., Newman, D.J.: UCI Machine Learning Repository (2007)
Chai, X., Deng, L., Yang, Q., Ling, C.X.: Test-Cost Sensitive Naïve Bayesian Classification. In: Proceedings of the Fourth IEEE International Conference on Data Mining. IEEE Computer Society Press, Brighton (2004)
Domingos, P.: MetaCost: A general method for making classifiers cost-sensitive. In: Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining, pp. 155–164. ACM Press (1999)
Sheng, V.S., Ling, C.X.: Thresholding for Making Classifiers Cost-sensitive. In: Proceedings of the 21st National Conference on Artificial Intelligence, Boston, Massachusetts, July 16-20, pp. 476–481 (2006)
Japkowicz, N.: The Class Imbalance Problem: Significance and Strategies. In: Proceedings of the 2000 International Conference on Artificial Intelligence (IC-AI 2000): Special Track on Inductive Learning, Las Vegas, Nevada (2000)
Solberg, A., Solberg, R.: A Large-Scale Evaluation of Features for Automatic Detection of Oil Spills in ERS SAR Images. In: International Geoscience and Remote Sensing Symposium, Lincoln, NE, pp. 1484–1486 (1996)
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Waiyamai, K., Suwannarattaphoom, P. (2014). A Cost-Sensitive Based Approach for Improving Associative Classification on Imbalanced Datasets. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2014. Lecture Notes in Computer Science, vol 8556. Springer, Cham. https://doi.org/10.1007/978-3-319-08979-9_3
DOI: https://doi.org/10.1007/978-3-319-08979-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08978-2
Online ISBN: 978-3-319-08979-9
eBook Packages: Computer Science, Computer Science (R0)