Rough sets-based tri-trade for partially labeled data

Applied Intelligence

Abstract

Rough set theory is one of the most representative models for handling supervised data affected by vagueness, imprecision, or uncertainty. However, little work has been devoted to learning from partially labeled data using rough sets. In this study, a rough sets-based tri-trade model is proposed for partially labeled data. First, a new discernibility matrix that considers both labeled and unlabeled data is proposed, and a beam search-based heuristic algorithm built on it generates multiple semi-supervised reducts. Then, a tri-trade model using three diverse semi-supervised reducts is developed, in which an embedded data editing technique generates reliable pseudo-labels for unlabeled data to improve the model. Both theoretical analysis and comparative experiments on UCI datasets show that the proposed model can effectively utilize unlabeled data to improve generalization performance and that it compares favorably with other representative methods.
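
As a rough illustration of the two building blocks named in the abstract, the sketch below computes the classical, fully supervised discernibility entries of a small decision table and runs a simple beam search for attribute subsets that intersect every entry. It is not the semi-supervised discernibility matrix or the heuristic proposed in the paper, whose definitions are not given in the abstract. The function names, the toy data, and the `beam_width` parameter are assumptions made for this example, and the returned subsets are not checked for redundancy, so they are reduct candidates rather than guaranteed reducts.

```python
from itertools import combinations


def discernibility_entries(rows, labels):
    """Classical (supervised) discernibility entries.

    For each pair of objects with different labels, collect the set of
    attribute indices on which the two objects differ.
    """
    entries = []
    for (i, x), (j, y) in combinations(enumerate(rows), 2):
        if labels[i] != labels[j]:
            diff = frozenset(k for k, (a, b) in enumerate(zip(x, y)) if a != b)
            if diff:  # identical objects with different labels give no entry
                entries.append(diff)
    return entries


def beam_search_hitting_sets(entries, n_attrs, beam_width=3):
    """Beam search for attribute subsets that intersect every entry.

    Hypothetical helper: keeps the beam_width best partial subsets per
    level, ranked by how many entries they already intersect.
    """
    if not entries:
        return [frozenset()]

    def covered(subset):
        return sum(1 for e in entries if e & subset)

    beam, found = [frozenset()], []
    while beam and len(found) < beam_width:
        candidates = set()
        for subset in beam:
            for a in range(n_attrs):
                grown = subset | {a}
                # Skip supersets of subsets that already hit every entry.
                if a not in subset and not any(f <= grown for f in found):
                    candidates.add(grown)
        ranked = sorted(candidates, key=lambda s: (-covered(s), sorted(s)))
        beam = []
        for s in ranked[:beam_width]:
            if covered(s) == len(entries):
                found.append(s)
            else:
                beam.append(s)
    return found[:beam_width]


# Toy decision table: the label depends only on attribute 1.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
labels = [0, 1, 0, 1]
entries = discernibility_entries(rows, labels)
candidates = beam_search_hitting_sets(entries, n_attrs=3)
```

On this toy table the only subset returned is the one containing attribute 1, since that attribute alone separates every pair of objects with different labels.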

Data Availability

Data and code are available from the corresponding author upon reasonable request.

Acknowledgements

The authors would like to thank the Editor-in-Chief, Editor, and anonymous reviewers for their kind help and valuable comments. This work is supported in part by the National Natural Science Foundation of China (Nos. 61806127, 62076164), the Natural Science Foundation of Guangdong Province, China (No. 2021A1515011861), the Shenzhen Science and Technology Program (No. JCYJ20210324094601005), and the Shenzhen Institute of Artificial Intelligence and Robotics for Society.

Author information

Corresponding author

Correspondence to Can Gao.

Ethics declarations

Competing interests

The authors declare that they have no competing interests that could influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Luo, Z., Gao, C. & Zhou, J. Rough sets-based tri-trade for partially labeled data. Appl Intell 53, 17708–17726 (2023). https://doi.org/10.1007/s10489-022-04405-3

  • DOI: https://doi.org/10.1007/s10489-022-04405-3
