Abstract
The classical ID3 decision tree algorithm cannot handle continuous attributes directly and classifies such data poorly. Moreover, most existing approaches rely on a single node-measurement mechanism, which hampers the construction of decision trees. To address these problems, we propose an improved ID3 algorithm (called DIGGI) based on variable precision neighborhood rough sets. First, we introduce the notions of the variable precision neighborhood (VPN) equivalence relation and the VPN equivalence granule, and investigate their basic properties. Second, we present the VPN rough set model and propose two extended measures: the VPN information gain and the VPN Gini index. Finally, we construct a hybrid measure that uses the VPN dependence to combine these two extended measures, and adopt the VPN equivalence granule as the splitting rule of DIGGI. Experimental results show that DIGGI is effective, and that its accuracy is greatly improved compared with three traditional decision tree algorithms as well as the neighborhood decision tree (NDT) and the variable precision neighborhood decision tree (VPNDT) recently proposed in the literature.
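The abstract's hybrid measure can be illustrated with a minimal sketch. The following Python code is a hypothetical reconstruction based only on the abstract, not the authors' published implementation: it assumes a δ-neighborhood on a single continuous attribute, a variable-precision (β-consistent) positive region for the VPN dependence, and a dependence-weighted combination of neighborhood information gain and neighborhood Gini reduction. The function names, the distance choice, and the exact weighting scheme are all assumptions for illustration.

```python
import numpy as np

def entropy(y):
    """Shannon entropy of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def gini(y):
    """Gini impurity of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - (p ** 2).sum())

def neighborhoods(X, attr, delta):
    """delta-neighborhood of each sample on one continuous attribute (assumed metric)."""
    col = X[:, attr]
    return [np.where(np.abs(col - col[i]) <= delta)[0] for i in range(len(col))]

def vpn_dependence(nbrs, y, beta):
    """Fraction of samples whose neighborhood agrees with their class to degree >= beta
    (a variable-precision positive region; hypothetical formulation)."""
    pos = sum(1 for i, nb in enumerate(nbrs) if (y[nb] == y[i]).mean() >= beta)
    return pos / len(y)

def hybrid_measure(X, y, attr, delta=0.2, beta=0.8):
    """Hypothetical hybrid splitting measure: the VPN dependence weights the
    neighborhood information gain against the neighborhood Gini reduction."""
    nbrs = neighborhoods(X, attr, delta)
    w = np.array([len(nb) for nb in nbrs], dtype=float)
    w /= w.sum()  # size-proportional weight of each neighborhood granule
    cond_entropy = sum(wi * entropy(y[nb]) for wi, nb in zip(w, nbrs))
    cond_gini = sum(wi * gini(y[nb]) for wi, nb in zip(w, nbrs))
    gain = entropy(y) - cond_entropy
    gini_red = gini(y) - cond_gini
    dep = vpn_dependence(nbrs, y, beta)
    return dep * gain + (1.0 - dep) * gini_red

# toy data: attribute 0 separates the classes cleanly, attribute 1 is noise
X = np.array([[0.10, 0.9], [0.20, 0.1], [0.15, 0.5],
              [0.80, 0.4], [0.90, 0.8], [0.85, 0.2]])
y = np.array([0, 0, 0, 1, 1, 1])
print(hybrid_measure(X, y, 0) > hybrid_measure(X, y, 1))  # prints True
```

Under this sketch, a decision tree builder would pick the attribute maximizing `hybrid_measure` at each node and split on the induced granules; the discriminative attribute scores higher than the noisy one, which is the behavior the hybrid measure is designed to encourage.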
Funding
This work was supported by the National Natural Science Foundation of China under Grant Nos. 62166001, 61976158, and 62266003, the Jiangxi Provincial Natural Science Foundation under Grant No. 20224BAB212022, and the Science and Technology Project of the Education Department of Jiangxi Province under Grant No. GJJ211435.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Liu, C., Lai, J., Lin, B. et al. An improved ID3 algorithm based on variable precision neighborhood rough sets. Appl Intell 53, 23641–23654 (2023). https://doi.org/10.1007/s10489-023-04779-y