Abstract
Decision trees are a classification technique widely used in real-world applications. They suffer from several challenges, such as structural instability, overfitting, and the curse of dimensionality. To address some of these issues, the vertical partitioning paradigm has been used in the literature. In vertical partitioning, the feature set is split into multiple subsets, and these subsets are used for subsequent processing instead of the original feature set. In this paper, we propose a novel partitioning approach using highest-size frequent itemsets. The efficiency of the method is evaluated using 5 standard datasets from the UCI repository. The proposed method achieves significant improvement in classification accuracy and demonstrates better or competitive structural stability compared to classical decision tree methods. The statistical significance of the results is demonstrated using the t-test, the Wilcoxon signed-rank test, and the Pearson correlation test.
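As a rough illustration of the idea, the sketch below mines frequent itemsets over feature names and uses the largest ones as vertical partitions of the feature set. It is not the authors' algorithm: the abstract does not give the procedure, so the binarized toy data, the rule that a feature is "present" when its value is 1, the support threshold, and the helper names frequent_itemsets, largest_itemsets, and project are all assumptions made for this example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force level-wise search. Returns {frozenset: support count}."""
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    current = [frozenset([i]) for i in items]
    while current:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        if not level:
            break
        frequent.update(level)
        # Build candidates one item larger by joining frequent itemsets.
        k = len(next(iter(level))) + 1
        current = list({a | b for a, b in combinations(level, 2)
                        if len(a | b) == k})
    return frequent


def largest_itemsets(frequent):
    """Keep only the frequent itemsets of maximum size, as sorted lists."""
    max_size = max(len(s) for s in frequent)
    return sorted(sorted(s) for s in frequent if len(s) == max_size)


def project(rows, features):
    """Vertical partition: restrict every row to the given feature subset."""
    return [{f: row[f] for f in features} for row in rows]


# Hypothetical binarized dataset: 5 rows, 4 features (assumption).
rows = [
    {"f1": 1, "f2": 1, "f3": 0, "f4": 1},
    {"f1": 1, "f2": 1, "f3": 1, "f4": 0},
    {"f1": 1, "f2": 1, "f3": 0, "f4": 1},
    {"f1": 0, "f2": 0, "f3": 1, "f4": 1},
    {"f1": 1, "f2": 1, "f3": 1, "f4": 0},
]

# Assumption: a feature is an "item" of a row when its value is 1.
transactions = [{f for f, v in row.items() if v == 1} for row in rows]

partitions = largest_itemsets(frequent_itemsets(transactions, min_support=2))
print(partitions)  # [['f1', 'f2', 'f3'], ['f1', 'f2', 'f4']]

# Each partition yields a reduced view of the data for subsequent processing.
views = [project(rows, p) for p in partitions]
```

In this sketch each view could then be passed to a separate decision tree learner; how the paper trains on and combines the partitions is not stated in the abstract and is left open here.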
Keywords
- Vertical partitioning
- Frequent itemsets
- Decision tree
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Guggari, S., Kadappa, V., Umadevi, V. (2020). Frequent Itemsets Based Partitioning Approach to Decision Tree Classifier. In: B. R., P., Thenkanidiyoor, V., Prasath, R., Vanga, O. (eds) Mining Intelligence and Knowledge Exploration. MIKE 2019. Lecture Notes in Computer Science, vol 11987. Springer, Cham. https://doi.org/10.1007/978-3-030-66187-8_27
DOI: https://doi.org/10.1007/978-3-030-66187-8_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66186-1
Online ISBN: 978-3-030-66187-8
eBook Packages: Computer Science, Computer Science (R0)