Abstract
Traditionally, shortest length has been the optimality criterion in rough set based optimal and near-optimal reduct computation. A more generalizable alternative was recently introduced that uses the coarsest granular space as the optimality criterion. However, owing to its exponential time complexity, that approach does not scale even to moderate-sized data sets. This article formulates two near-optimal reduct computation alternatives that scale to comparatively larger data sets. The first algorithm employs a controlled A∗ search strategy to find a near-optimal reduct while reducing both space utilization and computational time. The second algorithm employs a greedy sequential backward elimination (SBE) strategy on the higher granular space attribute ordering to obtain a coarsest granular space based near-optimal reduct. A comparative experimental study is conducted among the proposed approaches, the coarsest granular space based optimal reduct algorithm A∗RSOR, and state-of-the-art shortest length based optimal and near-optimal reduct algorithms. The results support the relevance of the proposed approaches in obtaining near-optimal reducts with increased scalability and in inducing classification models of comparable or improved generalizability.
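To make the SBE idea concrete, the following is a minimal sketch and not the article's algorithm. It assumes a classical rough set setting in which a reduct must preserve the positive region, and it reads "higher granular space attribute ordering" as trying to eliminate first those attributes that, taken alone, induce the most granules. The helper names (`partition`, `positive_region_size`, `granularity_order`, `sbe_reduct`) and the toy decision table are hypothetical.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group object indices by their values on attrs (the granular space)."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def positive_region_size(rows, decisions, attrs):
    """Count objects that lie in granules whose members share one decision."""
    return sum(len(block) for block in partition(rows, attrs)
               if len({decisions[i] for i in block}) == 1)

def granularity_order(rows, n_attrs):
    """Assumed ordering: attributes inducing more granules come first."""
    return sorted(range(n_attrs), key=lambda a: -len(partition(rows, [a])))

def sbe_reduct(rows, decisions, order):
    """Greedy SBE: drop an attribute whenever the positive region is unchanged."""
    reduct = list(order)
    target = positive_region_size(rows, decisions, reduct)
    for a in order:
        trial = [x for x in reduct if x != a]
        if positive_region_size(rows, decisions, trial) == target:
            reduct = trial
    return sorted(reduct)

# Toy decision table (hypothetical): attribute 2 behaves like an identifier.
rows = [(0, 0, 0), (0, 0, 1), (0, 1, 2), (1, 0, 3), (1, 1, 4), (1, 1, 5)]
decisions = [0, 0, 1, 1, 0, 0]

reduct = sbe_reduct(rows, decisions, granularity_order(rows, 3))
granules_reduct = len(partition(rows, reduct))
granules_shortest = len(partition(rows, [2]))
```

In this toy table the single attribute 2 is a shortest reduct but splits the objects into six granules, whereas the two-attribute reduct found by the sketch yields four, which illustrates how the shortest-length and coarsest-granular-space criteria can disagree.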
Notes
The granular space with the smallest cardinality is considered the coarsest granular space.
A simplified binary discernibility matrix obtained through repeated applications of the absorption law.
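As a rough illustration of the absorption law mentioned in the second note, the sketch below simplifies a collection of discernibility entries. It assumes each entry is represented as a set of attribute names rather than as a binary row, and the function name `absorb` and the sample entries are hypothetical. Because a clause such as (a ∨ b) ∧ a reduces to a, any entry that strictly contains another entry can be dropped.

```python
def absorb(entries):
    """Drop every entry that is a strict superset of another entry."""
    unique = {frozenset(e) for e in entries if e}  # ignore empty entries, dedupe
    kept = (e for e in unique if not any(other < e for other in unique))
    return sorted(kept, key=lambda e: (len(e), sorted(e)))

# Hypothetical discernibility entries over attributes a, b, c.
entries = [{"a", "b"}, {"a"}, {"b", "c"}, {"a", "b", "c"}, {"b", "c"}]
simplified = absorb(entries)
```

Comparing every entry against all others in one pass leaves exactly the minimal entries, which is the same result repeated pairwise applications of the law would reach.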
Acknowledgements
This research acknowledges the financial support of UoH-IoE by MHRD (F11/9/2019-U3(A)). The first author acknowledges the support of the Senior Research Fellowship from the Council of Scientific & Industrial Research (CSIR), Ministry of Science and Technology, Government of India.
About this article
Cite this article
Bar, A., Prasad, P.S.V.S.S. Approaches for coarsest granularity based near-optimal reduct computation. Appl Intell 53, 4231–4256 (2023). https://doi.org/10.1007/s10489-022-03571-8
DOI: https://doi.org/10.1007/s10489-022-03571-8