Abstract
The optimal reduct computation problem aims to obtain the best reduct out of all possible reducts of a given decision system. In the rough set literature, two optimality criteria exist for computing an optimal reduct: shortest length and coarsest granular space. An optimal reduct based on the coarsest granular space can induce a classification model that generalizes better. \(A^*RSOR\) is an existing \(A^*\) search-based optimal reduct computation algorithm that uses the coarsest granular space as its optimality criterion. This article proposes an improved coarsest granularity-based optimal reduct approach, \(MA^*\_RSOR\), derived from an analysis of the search process's behaviour in the \(A^*RSOR\) algorithm. To reduce search space usage and arrive at an optimal reduct in less time, suitable modifications are incorporated using the domain knowledge of rough set theory. The relevance of \(MA^*\_RSOR\) is demonstrated through theoretical analysis and comparative experimental validation against state-of-the-art algorithms. Experimental results on benchmark data sets establish that \(MA^*\_RSOR\) achieves a significant computational time gain (\(49\%\) to \(99\%\)) and space reduction (\(37\%\) to \(96\%\)) over \(A^*RSOR\). \(MA^*\_RSOR\) was also able to induce classification models with significantly better classification accuracies than state-of-the-art shortest length-based optimal/near-optimal reduct computation algorithms. In addition, a coefficient of variation-based heuristic, \(CV_{\text {NonCore}}\), is proposed for predicting when the \(MA^*\_RSOR\) algorithm is appropriate to use. Experimental results validate the relevance of the heuristic: its prediction was correct on 8 out of 10 data sets.
Availability of data and materials
Manuscript has no associated data.
Notes
Note: Without loss of generality, \(U/IND(\emptyset )\) is taken as \(\{U\}\). That is, objects are indistinguishable in the absence of any attribute information.
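The convention in this note can be illustrated with a minimal Python sketch. It is not taken from the article: the toy decision table, the object names `x1` to `x4`, the attribute names `a` and `b`, and the helper `partition` are all hypothetical. The sketch groups objects by their values on an attribute subset, so that an empty subset gives every object the same (empty) key and the partition collapses to the single block \(\{U\}\).

```python
from collections import defaultdict


def partition(universe, table, attrs):
    """Return U/IND(attrs) as a list of sets of object ids (hypothetical helper)."""
    blocks = defaultdict(set)
    for obj in universe:
        # Objects with identical values on every attribute in attrs
        # share a key and therefore fall into the same equivalence class.
        key = tuple(table[obj][a] for a in attrs)
        blocks[key].add(obj)
    return list(blocks.values())


# Hypothetical toy table: four objects, two conditional attributes.
table = {
    "x1": {"a": 0, "b": 1},
    "x2": {"a": 0, "b": 0},
    "x3": {"a": 1, "b": 1},
    "x4": {"a": 1, "b": 1},
}
universe = sorted(table)

empty = partition(universe, table, [])         # no attributes: key is () for all
by_a = partition(universe, table, ["a"])       # split on attribute a only
by_ab = partition(universe, table, ["a", "b"]) # split on both attributes

print(len(empty), len(by_a), len(by_ab))
```

In this toy table the partition can only stay the same or become finer as attributes are added, which is the sense in which a smaller number of blocks corresponds to a coarser granular space.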
Acknowledgements
This research acknowledges the financial support of UoH-IoE by MHRD (F11/9/2019-U3(A)). The first author acknowledges the support of the Senior Research Fellowship from the Council of Scientific and Industrial Research (CSIR) (09/414(1117)/2016-EMR-I), Ministry of Science and Technology, Government of India.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
About this article
Cite this article
Bar, A., Kumar, A. & Sai Prasad, P.S.V.S. Coarsest granularity-based optimal reduct using A* search. Granul. Comput. 8, 45–66 (2023). https://doi.org/10.1007/s41066-022-00313-6
DOI: https://doi.org/10.1007/s41066-022-00313-6
Keywords
- Feature selection
- Optimal reduct computation
- Rough set theory
- \(A^*\) search
- Granular computing