Abstract
Rule-based classification is one of the most important topics in data mining because of its wide range of applications. This article presents a novel rule-based classifier called RACER (Rule Aggregating ClassifiER) that aims to improve the accuracy of data classification. RACER uses a specific rule representation that allows it to treat each instance in the training data as an initial rule at no additional cost. To obtain an applicable rule set, RACER then attempts to combine the initial rules in pairs, and it keeps a combined rule only if its fitness value is better than that of both input rules. We used seventeen datasets from the UCI Machine Learning Repository to evaluate RACER's ability to classify various kinds of data. To assess RACER's performance, we compared our results with those of several well-known classifiers, including CN2, PART, C4.5 and SVM. Our experiments show that RACER is an effective classifier in various domains and achieves better average classification accuracy and understandability than the other classifiers evaluated.
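The aggregation idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the rule encoding or the fitness function, so everything concrete here is an assumption made for illustration: rules are stored as one bitmask per attribute, two rules of the same class are combined with a bitwise OR, and the hypothetical `fitness` function weights accuracy and class coverage equally. The function names (`instance_to_rule`, `covers`, `fitness`, `combine`, `aggregate`) are likewise illustrative and are not taken from the article.

```python
from typing import List, Sequence, Tuple

# Assumed encoding: a rule is (masks, label), with one bitmask per attribute.
# Bit v set in a mask means "attribute value v is allowed by this rule".
Rule = Tuple[Tuple[int, ...], int]
Instance = Tuple[Sequence[int], int]


def instance_to_rule(values: Sequence[int], label: int) -> Rule:
    """Turn a training instance into a rule that matches exactly that instance."""
    return tuple(1 << v for v in values), label


def covers(rule: Rule, values: Sequence[int]) -> bool:
    """A rule covers an instance if every attribute value is allowed by its mask."""
    masks, _ = rule
    return all(m & (1 << v) for m, v in zip(masks, values))


def fitness(rule: Rule, data: List[Instance]) -> float:
    """Hypothetical fitness: equal-weight mix of accuracy and class coverage."""
    covered = [lab for vals, lab in data if covers(rule, vals)]
    class_total = sum(1 for _, lab in data if lab == rule[1])
    if not covered or class_total == 0:
        return 0.0
    correct = sum(1 for lab in covered if lab == rule[1])
    accuracy = correct / len(covered)
    coverage = correct / class_total
    return 0.5 * accuracy + 0.5 * coverage


def combine(a: Rule, b: Rule) -> Rule:
    """Generalize two same-class rules by OR-ing their attribute masks."""
    return tuple(x | y for x, y in zip(a[0], b[0])), a[1]


def aggregate(data: List[Instance]) -> List[Rule]:
    """Greedily merge rule pairs whenever the merged rule beats both inputs."""
    rules = [instance_to_rule(vals, lab) for vals, lab in data]
    i = 0
    while i < len(rules):
        j = i + 1
        while j < len(rules):
            a, b = rules[i], rules[j]
            if a[1] == b[1]:
                merged = combine(a, b)
                if fitness(merged, data) > max(fitness(a, data), fitness(b, data)):
                    rules[i] = merged  # keep the better, more general rule
                    del rules[j]       # drop the absorbed rule; recheck index j
                    continue
            j += 1
        i += 1
    return rules


if __name__ == "__main__":
    # Toy data: two categorical attributes (values 0..2) and a binary class.
    toy = [((0, 0), 0), ((0, 1), 0), ((1, 2), 1), ((2, 2), 1)]
    for masks, label in aggregate(toy):
        print([bin(m) for m in masks], "->", label)
```

On the toy data, each pair of same-class instances merges into one rule, because the merged rule keeps its accuracy while covering more of its class. A merge that would start covering instances of another class lowers accuracy and is rejected whenever the resulting fitness does not exceed that of both inputs.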
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest and that they received no financial support that could have influenced the outcome of this work.
About this article
Cite this article
Basiri, J., Taghiyareh, F. & Faili, H. RACER: accurate and efficient classification based on rule aggregation approach. Neural Comput & Applic 31, 895–908 (2019). https://doi.org/10.1007/s00521-017-3117-2
DOI: https://doi.org/10.1007/s00521-017-3117-2