A classification rule reduction algorithm based on significance domains

  • Original Paper
  • Journal: TOP

Abstract

Rule systems generated from decision trees (such as CART, ID3 and C4.5) or from direct frequency-counting methods (such as Apriori) often contain rules that are non-significant or even contradictory. Nevertheless, most papers on this subject show that such rule sets can be reduced substantially by searching for and removing redundancies and conflicts and by simplifying similar rules. The objective of this paper is to present an algorithm (RBS: Reduction Based on Significance) that allocates a significance value to each rule in the system, so that experts can select the rules that should be considered preferable and understand the exact degree of correlation between the different rule attributes. Significance is calculated from two parameters of each rule, its antecedent frequency and its rule frequency: if the antecedent frequency is above a minimal level and the rule frequency lies in a critical interval, the algorithm computes the rule's significance ratio. The critical boundaries are calculated by an incremental method, the rule space is divided according to them, and the significance function is defined on the resulting intervals. As with other rule reduction methods, our approach can be applied to rule sets generated from decision trees or from frequency-counting algorithms; it works independently of the generating method, after the rule set has been created. Three simulated data sets are used to carry out a computational experiment. Standard data sets from the UCI Machine Learning Repository and two particular data sets with expert interpretation are also used, in order to obtain greater consistency. The proposed method yields a smaller and more easily understandable rule set than the original, and highlights the most significant attribute correlations, quantifying their influence on the consequent attribute.
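
The two parameters named in the abstract can be illustrated with a minimal sketch. It is not the RBS algorithm itself: the abstract does not give the significance function or the incremental method for the critical boundaries, so neither is reproduced here. The sketch assumes that antecedent frequency means the share of records matching the antecedent and that rule frequency means the share of those records that also match the consequent. The names `Rule`, `frequencies` and `is_candidate`, the threshold `min_antecedent_freq` and the bounds in `critical_interval` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Record = Dict[str, str]  # attribute name -> value


@dataclass
class Rule:
    antecedent: Dict[str, str]  # conditions on the predictor attributes
    consequent: Dict[str, str]  # condition on the class attribute


def matches(record: Record, conditions: Dict[str, str]) -> bool:
    """True if the record satisfies every attribute=value condition."""
    return all(record.get(attr) == value for attr, value in conditions.items())


def frequencies(rule: Rule, records: List[Record]) -> Tuple[float, float]:
    """Return (antecedent frequency, rule frequency) under the assumed definitions."""
    n_antecedent = sum(1 for r in records if matches(r, rule.antecedent))
    n_rule = sum(
        1
        for r in records
        if matches(r, rule.antecedent) and matches(r, rule.consequent)
    )
    antecedent_freq = n_antecedent / len(records) if records else 0.0
    rule_freq = n_rule / n_antecedent if n_antecedent else 0.0
    return antecedent_freq, rule_freq


def is_candidate(
    rule: Rule,
    records: List[Record],
    min_antecedent_freq: float,              # hypothetical minimal level
    critical_interval: Tuple[float, float],  # hypothetical fixed bounds
) -> bool:
    """Check the abstract's precondition for computing a significance ratio."""
    antecedent_freq, rule_freq = frequencies(rule, records)
    low, high = critical_interval
    return antecedent_freq > min_antecedent_freq and low <= rule_freq <= high
```

In the paper the critical boundaries are derived incrementally from the data; passing them in as fixed numbers is a simplification made only to keep the sketch self-contained.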

Author information

Corresponding author

Correspondence to A. Rabasa.

About this article

Cite this article

Almiñana, M., Escudero, L.F., Pérez-Martín, A. et al. A classification rule reduction algorithm based on significance domains. TOP 22, 397–418 (2014). https://doi.org/10.1007/s11750-012-0264-6

  • DOI: https://doi.org/10.1007/s11750-012-0264-6
