A Cost-Sensitive Based Approach for Improving Associative Classification on Imbalanced Datasets

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8556)

Abstract

Associative classification is a rule-based classification approach that has been applied in many real-world applications. An associative classifier is easily interpretable because its model is a set of classification rules. However, there is room for improvement when associative classification is applied to imbalanced classification tasks: existing associative classification algorithms can be limited in their performance on highly imbalanced datasets in which the class of interest is the minority class. Our objective is to improve the accuracy of the associative classifier on highly imbalanced datasets. In this paper, an effective cost-sensitive rule ranking method named SSCR (Statistically Significant Cost-sensitive Rules) is proposed to estimate the risk of a rule in classifying unseen data. The risk of a statistically significant association rule is estimated from its classification cost induced from the training data. SSCR combines statistically significant association rules with cost-sensitive learning to build an associative classifier. Experimental results show that SSCR achieves the best performance in terms of true positive rate and recall on real-world imbalanced datasets, compared with CBA and C4.5.
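
The general idea of ranking statistically significant rules by their training-set classification cost can be illustrated with a minimal sketch. This is not the authors' SSCR implementation: the abstract does not specify the significance test, the cost function, or the ranking criterion, so the sketch assumes a one-sided Fisher's exact test, a user-supplied cost matrix, and ranking by average misclassification cost over the instances a rule covers. All names (`fisher_right_tail`, `rank_rules`, the toy data, and the cost values) are hypothetical.

```python
from math import comb


def fisher_right_tail(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a: covered and in the predicted class    b: covered, other class
    c: not covered, in the predicted class   d: not covered, other class
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, row1)


def rank_rules(rules, data, cost, alpha=0.05):
    """Keep significant rules and sort them by average training cost.

    rules: list of (antecedent frozenset, predicted class)
    data:  list of (item set, true class)
    cost:  dict mapping (true class, predicted class) to a cost (assumed)
    """
    ranked = []
    for antecedent, predicted in rules:
        a = b = c = d = 0
        total_cost = 0.0
        for items, label in data:
            covered = antecedent <= items
            if covered and label == predicted:
                a += 1
            elif covered:
                b += 1
                total_cost += cost[(label, predicted)]
            elif label == predicted:
                c += 1
            else:
                d += 1
        if a + b == 0:
            continue  # the rule covers no training instance
        p_value = fisher_right_tail(a, b, c, d)
        if p_value <= alpha:
            ranked.append((total_cost / (a + b), p_value, antecedent, predicted))
    # Lower average cost first; ties are broken by the smaller p-value.
    ranked.sort(key=lambda r: (r[0], r[1]))
    return ranked


# Toy imbalanced data: 4 minority ("pos") and 16 majority ("neg") instances.
data = ([({"x", "y"}, "pos")] * 4
        + [({"x"}, "neg")]
        + [({"y"}, "neg")] * 15)

# Assumed costs: missing a minority instance is 10 times worse than a false alarm.
cost = {("pos", "neg"): 10.0, ("neg", "pos"): 1.0}

rules = [
    (frozenset({"x"}), "pos"),
    (frozenset({"y"}), "neg"),
    (frozenset({"x", "y"}), "pos"),
]

ranked = rank_rules(rules, data, cost)
for avg_cost, p_value, antecedent, predicted in ranked:
    print(sorted(antecedent), "->", predicted, avg_cost, p_value)
```

In this toy setting the rule with antecedent `{"y"}` is discarded by the significance filter, and the two remaining minority-class rules are ordered by their average cost. A real associative classifier would mine the candidate rules first (for example with an Apriori-style algorithm) rather than list them by hand.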

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Waiyamai, K., Suwannarattaphoom, P. (2014). A Cost-Sensitive Based Approach for Improving Associative Classification on Imbalanced Datasets. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2014. Lecture Notes in Computer Science, vol 8556. Springer, Cham. https://doi.org/10.1007/978-3-319-08979-9_3

  • DOI: https://doi.org/10.1007/978-3-319-08979-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08978-2

  • Online ISBN: 978-3-319-08979-9

  • eBook Packages: Computer Science, Computer Science (R0)
