Irrelevant Feature and Rule Removal for Structural Associative Classification Using Structure-Preserving Flat Representation

  • Izwan Nizal Mohd Shaharanee
  • Fedja Hadzic
Part of the Studies in Computational Intelligence book series (SCI, volume 584)


Practical applications of association rule mining often suffer from an overwhelming number of generated rules, many of which are not interesting or useful for the application in question. Removing irrelevant features, and rules comprised of irrelevant features, can significantly improve overall performance. Many statistical and constraint-based measures are used to discard unnecessary and irrelevant features and rules for vectorial or tabular data. In contrast, the use of such measures is limited in the tree-structured data domain, because the structural aspects are not easily incorporated. In this chapter, we explore the use of a feature subset selection measure, as well as a number of common statistical interestingness measures, via a recently proposed structure-preserving flat representation for tree-structured data such as XML. Feature subset selection is applied prior to association rule generation. Once the initial set of rules is obtained, irrelevant rules are identified as those comprised of attributes that were not found to be statistically significant for the classification task. The experiments are performed using real-world web access trees and a property management dataset. The results indicate that when the dataset has a more regular structure, a large number of insignificant rules are discarded and accuracy increases. However, when the tree instances vary greatly in structure and in the distribution of labels among nodes, many rules are removed and accuracy increases, but the coverage rate of the rule set is significantly reduced.
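
The pruning idea described above can be illustrated with a minimal sketch. It assumes each tree instance has already been converted to a flat record, and it uses a plain chi-squared test of independence as a stand-in for the feature subset selection measure, which the abstract does not name. The attribute names `X0` and `X1`, the rule layout, the toy data, and the coverage definition (fraction of records matched by at least one rule) are all hypothetical choices made for illustration.

```python
from collections import Counter

# Chi-squared critical values at alpha = 0.05, indexed by degrees of freedom.
# Assumption: this small table only covers 1 to 4 degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def chi2_statistic(values, labels):
    """Pearson chi-squared statistic and degrees of freedom for attribute vs. class."""
    n = len(values)
    joint = Counter(zip(values, labels))
    value_totals = Counter(values)
    label_totals = Counter(labels)
    stat = 0.0
    for v in value_totals:
        for c in label_totals:
            expected = value_totals[v] * label_totals[c] / n
            stat += (joint[(v, c)] - expected) ** 2 / expected
    dof = (len(value_totals) - 1) * (len(label_totals) - 1)
    return stat, dof

def significant_attributes(records, labels):
    """Attributes whose association with the class is significant at alpha = 0.05."""
    keep = set()
    for attr in records[0]:
        stat, dof = chi2_statistic([r[attr] for r in records], labels)
        if dof > 0 and stat > CHI2_CRIT_05[dof]:
            keep.add(attr)
    return keep

def prune_rules(rules, keep):
    """Keep only rules whose antecedent uses significant attributes exclusively."""
    return [r for r in rules if set(r["antecedent"]) <= keep]

def coverage_rate(rules, records):
    """Fraction of records matched by the antecedent of at least one rule."""
    def matches(rule, rec):
        return all(rec.get(a) == v for a, v in rule["antecedent"].items())
    covered = sum(any(matches(r, rec) for r in rules) for rec in records)
    return covered / len(records)

# Toy flat records (hypothetical): X0 determines the class, X1 is unrelated to it.
records = [{"X0": x0, "X1": x1}
           for x0, x1 in zip("aaaabbbb", "pqpqpqpq")]
labels = ["yes"] * 4 + ["no"] * 4

rules = [
    {"antecedent": {"X0": "a"}, "class": "yes"},
    {"antecedent": {"X0": "a", "X1": "p"}, "class": "yes"},
    {"antecedent": {"X1": "q"}, "class": "no"},
]

keep = significant_attributes(records, labels)
pruned = prune_rules(rules, keep)
coverage_before = coverage_rate(rules, records)
coverage_after = coverage_rate(pruned, records)
```

On this toy data only `X0` passes the test, so the two rules that mention `X1` are dropped and the fraction of records covered by the rule set falls. This mirrors, in miniature, the trade-off between rule reduction and coverage that the abstract reports. The counts here are far too small for a chi-squared test to be statistically reliable; they only make the arithmetic easy to follow.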


Tree-structured data; Association rule based classification; Feature subset selection; Statistical interestingness



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. School of Quantitative Sciences, Universiti Utara Malaysia, Sintok, Malaysia
  2. Department of Computing, Curtin University, Perth, Australia
