Semi-supervised attribute reduction via attribute indiscernibility

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Attribute reduction based on rough sets plays an important role in data preprocessing. The discernibility pair, an effective information measure, has received extensive attention in attribute reduction. However, existing attribute importance measures based on discernibility pairs do not apply well to partially labeled data. Moreover, most existing attribute reduction algorithms focus on the relationships between objects and neglect the relationships between attributes, which may leave highly redundant attributes in the selected subset. Within the framework of rough set theory, this paper studies semi-supervised attribute reduction, i.e., attribute reduction for partially labeled data. First, we introduce the concept of a discernibility pair based on object indiscernibility and propose a semi-supervised attribute reduction algorithm via the maximum discernibility pair, which combines supervised and unsupervised discernibility pair strategies. Second, considering the relationships between attributes, we put forward new methods that use discernibility pairs to define the similarity and distinction between attributes. Third, we propose a semi-supervised attribute reduction algorithm based on indiscernible attribute classes. Finally, comparative experiments indicate that the proposed algorithms are effective.
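
To make the idea of a discernibility pair concrete, the following is a minimal Python sketch of greedy attribute selection on a small, partially labeled categorical table. It is not the algorithm proposed in the paper, whose exact definitions are not given in the abstract. The function names (`discerned_pairs`, `target_pairs`, `greedy_reduct`), the toy data, and in particular the rule for combining the supervised and unsupervised views are illustrative assumptions: a pair of labeled objects counts only if their labels differ, and a pair involving an unlabeled object counts if at least one attribute separates it.

```python
from itertools import combinations


def discerned_pairs(data, attr, pairs):
    """Return the pairs in `pairs` whose two objects differ on attribute `attr`."""
    return {(i, j) for (i, j) in pairs if data[i][attr] != data[j][attr]}


def target_pairs(data, labels):
    """Collect the object pairs that the selected attributes should discern.

    Assumed combination rule (hypothetical, for illustration only):
    - both objects labeled: keep the pair only if the labels differ
    - otherwise: keep the pair if any attribute separates the two objects
    """
    n_attrs = len(data[0])
    pairs = set()
    for i, j in combinations(range(len(data)), 2):
        if labels[i] is not None and labels[j] is not None:
            if labels[i] != labels[j]:
                pairs.add((i, j))
        elif any(data[i][a] != data[j][a] for a in range(n_attrs)):
            pairs.add((i, j))
    return pairs


def greedy_reduct(data, labels):
    """Repeatedly pick the attribute that discerns the most remaining pairs."""
    remaining = target_pairs(data, labels)
    candidates = set(range(len(data[0])))
    selected = []
    while remaining and candidates:
        # Ties are broken in favor of the lowest attribute index.
        best = max(sorted(candidates),
                   key=lambda a: len(discerned_pairs(data, a, remaining)))
        gain = discerned_pairs(data, best, remaining)
        if not gain:
            break  # no attribute separates any remaining pair
        selected.append(best)
        candidates.remove(best)
        remaining -= gain
    return selected


# Toy table: attribute 2 is a renamed copy of attribute 0, so it is redundant.
data = [
    ("a", "x", "p"),
    ("a", "y", "p"),
    ("b", "x", "q"),
    ("b", "y", "q"),
]
labels = [0, None, 1, None]  # None marks an unlabeled object

reduct = greedy_reduct(data, labels)
print(reduct)  # [0, 1]
```

In this toy table, attributes 0 and 2 discern exactly the same pairs, so once attribute 0 is chosen the copy contributes nothing and is never selected. That is the kind of redundancy between attributes that the abstract says object-based measures alone can miss.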

Acknowledgements

This work was supported by the National Natural Science Foundation of China (61976089), the Major Program of the National Social Science Foundation of China (20&ZD047), the Natural Science Foundation of Hunan Province (2021JJ30451, 2022JJ30397), and the Hunan Provincial Science & Technology Project Foundation (2018RS3065, 2018TP1018).

Author information

Corresponding author

Correspondence to Shaojun Qu.

About this article

Cite this article

Dai, J., Wang, W., Zhang, C. et al. Semi-supervised attribute reduction via attribute indiscernibility. Int. J. Mach. Learn. & Cyber. 14, 1445–1464 (2023). https://doi.org/10.1007/s13042-022-01708-2

  • DOI: https://doi.org/10.1007/s13042-022-01708-2
