Abstract
Existing classification and rule learning algorithms in machine learning mainly use heuristic/greedy search to find a subset of regularities (e.g., a decision tree or a set of rules) in data for classification. In the past few years, extensive research was done in the database community on learning rules using exhaustive search under the name of association rule mining. The objective there is to find all rules in data that satisfy the user-specified minimum support and minimum confidence. Although the whole set of rules may not be used directly for accurate classification, effective and efficient classifiers have been built using the rules. This paper aims to improve such an exhaustive search based classification system, CBA (Classification Based on Associations). The main strength of this system is that it is able to use the most accurate rules for classification. However, it also has weaknesses. This paper proposes two new techniques to deal with these weaknesses, resulting in remarkably accurate classifiers. Experiments on a set of 34 benchmark datasets show that on average the new techniques reduce the error of CBA by 17% and are superior to CBA on 26 of the 34 datasets. They reduce the error of the decision tree classifier C4.5 by 19%, and improve performance on 29 datasets. Similar good results are also achieved against the existing classification systems RIPPER, LB and a Naïve-Bayes classifier.
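To make the abstract's core idea concrete, the following is a minimal, hypothetical sketch (not the authors' CBA implementation) of mining class association rules by exhaustive search: every small itemset of attribute values is paired with a class label, and a rule is kept only if it meets user-specified minimum support and minimum confidence thresholds. The dataset, thresholds, and function names are illustrative assumptions.

```python
from itertools import combinations

# Hypothetical thresholds, as described in the abstract.
MIN_SUP = 0.4
MIN_CONF = 0.7

# Toy dataset: each row is (set of attribute-value items, class label).
data = [
    ({"outlook=sunny", "windy=no"}, "play"),
    ({"outlook=sunny", "windy=yes"}, "dont"),
    ({"outlook=rain", "windy=no"}, "play"),
    ({"outlook=rain", "windy=no"}, "play"),
    ({"outlook=sunny", "windy=no"}, "play"),
]

def mine_rules(data, min_sup, min_conf):
    """Exhaustively enumerate antecedents of size 1 and 2 and keep
    rules 'antecedent -> class' meeting support and confidence."""
    n = len(data)
    items = sorted({i for t, _ in data for i in t})
    labels = {c for _, c in data}
    rules = []
    for k in (1, 2):                      # antecedent sizes considered
        for ante in combinations(items, k):
            ante = set(ante)
            # Class labels of all rows covered by the antecedent.
            covered = [c for t, c in data if ante <= t]
            if not covered:
                continue
            sup_ante = len(covered) / n
            for label in labels:
                sup_rule = covered.count(label) / n
                conf = sup_rule / sup_ante
                if sup_rule >= min_sup and conf >= min_conf:
                    rules.append((frozenset(ante), label, sup_rule, conf))
    return rules
```

A real system such as CBA would additionally prune the search with the Apriori property, rank the rules, and select a subset to build the final classifier; the sketch only illustrates the support/confidence filter the abstract refers to.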
References
R. Agrawal, and R. Srikant. Fast Algorithms for Mining Association Rules. In Proceedings of VLDB-94, 1994.
K. Ali, S. Manganaris and R. Srikant. Partial Classification Using Association Rules. In Proceedings of KDD-97, 115–118, 1997.
K. Ali and M. Pazzani. Error Reduction through Learning Multiple Descriptions. Machine Learning, 24:3, 1996.
R. J. Bayardo. Brute-force Mining of High-confidence Classification Rules. In Proceedings of KDD-97, 1997.
P. Chan and S. J. Stolfo. Experiments on Multistrategy Learning by Meta-learning. In Proceedings of the Second International Conference on Information and Knowledge Management (CIKM-93), 314–323, 1993.
P. Clark and T. Niblett. The CN2 Induction Algorithm. Machine Learning 3(1), 1989.
W. Cohen. Fast Effective Rule Induction. In Proceedings of ICML-95, 1995.
W. Cohen, and Y. Singer. A Simple, Fast, and Effective Rule Learner. In Proceedings of AAAI-99, 1999.
P. Domingos, and M. Pazzani. On the Optimality of the Simple Bayesian Classifier under Zero-One Loss. Machine Learning, 29, 1997.
G. Dong, X. Zhang, L. Wong, and J. Li. CAEP: Classification by Aggregating Emerging Patterns. In Proceedings of Discovery-Science-99, 1999.
J. Dougherty, R. Kohavi, and M. Sahami. Supervised and Unsupervised Discretization of Continuous Features. In Proceedings of ICML-95, 1995.
R. Duda, and P. Hart. Pattern Classification and Scene Analysis. Wiley, 1973.
U. Fayyad, and K. Irani. Multi-interval Discretization of Continuous-valued Attributes for Classification Learning. In Proceedings of IJCAI-93, 1022–1027, 1993.
Y. Freund, and R. Schapire. Experiments with a New Boosting Algorithm. In Proceedings of ICML-96, 1996.
J. Fürnkranz and G. Widmer. Incremental Reduced Error Pruning. In Proceedings of ICML-94, 1994.
R. Kohavi. Scaling up the Accuracy of Naïve-Bayes Classifiers: A Decision-tree Hybrid. In Proceedings of KDD-96, 1996.
R. Kohavi, G. John, R. Long, D. Manley, and K. Pfleger. MLC++: A Machine-learning Library in C++. Tools with artificial intelligence, 740–743, 1994.
N. Littlestone and M. Warmuth. The Weighted Majority Algorithm. Technical Report UCSC-CRL-89-16, UC Santa Cruz, 1989.
B. Liu, W. Hsu, and Y. Ma. Integrating Classification and Association Rule Mining. In Proceedings of KDD-98, 1998.
B. Liu, W. Hsu, and Y. Ma. Mining Association Rules with Multiple Minimum Supports. In Proceedings of KDD-99, 1999.
B. Liu, Y. Ma and C.-K. Wong. Improving an Exhaustive Search Based Rule Learner. In Proceedings of the Fourth European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-2000), 2000.
H. Lu, and H-Y. Liu. Decision Tables: Scalable Classification Exploring RDBMS Capabilities. VLDB-2000, 2000.
D. Meretakis and B. Wüthrich. Extending Naïve Bayes Classifiers Using Long Itemsets. In Proceedings of KDD-99, 1999.
C. J. Merz and P. Murphy. UCI Repository of Machine Learning Databases. [http://www.cs.uci.edu/~mlearn], 1996.
R. Michalski. Pattern Recognition as Rule-Guided Inductive Inference. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2, 349–361, 1980.
P. Murphy and M. Pazzani. Exploring the Decision Forest: An Empirical Investigation of Occam's Razor in Decision Tree Induction. Journal of AI Research, 1:257–275, 1994.
J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1992.
J. R. Quinlan. Combining Instance-based and Model-Based Learning. In Proceedings of ICML-94, 1994.
R. Rymon. SE-tree Outperforms Decision Trees in Noisy Domains. In Proceedings of KDD-96, 331–336, 1996.
K. Wang, S. Zhou, and Y. He. Growing Decision Trees on Support-less Association Rules. In Proceedings of KDD-2000, 2000.
G. Webb. Systematic Search for Categorical Attribute-value Data-driven Machine Learning. In Proceedings of Australian conference on Artificial Intelligence, 1993.
D. Wolpert. Stacked Generalization. Neural networks, 5:241–259, 1992.
Z. Zheng and G. Webb. Stochastic Attribute Selection Committees with Multiple Boosting: Learning More Accurate and More Stable Classifier Committees. In Proceedings of Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-99), 1999.
© 2001 Springer Science+Business Media Dordrecht
Liu, B., Ma, Y., Wong, CK. (2001). Classification Using Association Rules: Weaknesses and Enhancements. In: Grossman, R.L., Kamath, C., Kegelmeyer, P., Kumar, V., Namburu, R.R. (eds) Data Mining for Scientific and Engineering Applications. Massive Computing, vol 2. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-1733-7_30
Print ISBN: 978-1-4020-0114-7
Online ISBN: 978-1-4615-1733-7