Neighbourhood discernibility degree-based semisupervised feature selection for partially labelled mixed-type data with granular ball

Shu, Wenhao; Yu, Jianhui; Chen, Ting; Qian, Wenbin

doi:10.1007/s10489-023-04657-7

Published: 29 June 2023

Volume 53, pages 22467–22487, (2023)

Applied Intelligence

Wenhao Shu¹, Jianhui Yu¹, Ting Chen¹ & Wenbin Qian²

Abstract

Feature selection can effectively reduce data dimensionality by selecting a relevant feature subset. Rough set theory provides a powerful theoretical framework for the feature selection of categorical data with complete labels. In practice, however, many datasets contain only a small number of objects with label information and many unlabelled objects. Furthermore, most feature selection approaches are computationally expensive. To address these problems, a semisupervised feature selection algorithm based on neighbourhood discernibility with pseudolabelled granular balls is proposed. First, a set of granular balls is generated based on purity, which reduces the universe space by sampling. Then, the neighbourhood discernibility is proposed to evaluate the importance of the candidate features for both labelled and unlabelled objects. Finally, an ensemble voting algorithm is designed to perform feature selection, so that a feature subset with satisfactory performance is selected fairly rather than arbitrarily. Experimental results on UCI datasets verify the advantage of the proposed feature selection algorithm over other algorithms in terms of feature subset size, classification accuracy and computational time.
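
The first step of the abstract, generating granular balls based on purity, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the full text is not available here, so the sketch follows the splitting scheme commonly used in the granular-ball literature (split a ball in two until its purity reaches a threshold) and works on labelled objects only. The function names, the `purity_threshold` and `min_size` parameters, the 2-means seeding and the use of the mean distance as the radius are all assumptions made for illustration.

```python
import numpy as np

def purity(labels):
    """Fraction of objects in a ball that carry the ball's majority label."""
    _, counts = np.unique(labels, return_counts=True)
    return counts.max() / counts.sum()

def split_in_two(X, y, n_iter=10):
    """Assumed splitting rule: a tiny 2-means seeded at the centroids of
    the two most frequent labels. Returns a 0/1 assignment per object."""
    values, counts = np.unique(y, return_counts=True)
    if len(values) < 2:
        return np.zeros(len(X), dtype=int)
    top_two = values[np.argsort(counts)[-2:]]
    centers = np.stack([X[y == v].mean(axis=0) for v in top_two])
    assign = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        if assign.min() == assign.max():
            break  # one side is empty, no useful split exists
        centers = np.stack([X[assign == k].mean(axis=0) for k in (0, 1)])
    return assign

def make_ball(X, y, idx):
    """Summarise a group of objects as centre, radius and majority label."""
    center = X[idx].mean(axis=0)
    radius = np.linalg.norm(X[idx] - center, axis=1).mean()  # assumed: mean distance
    values, counts = np.unique(y[idx], return_counts=True)
    return {"center": center, "radius": radius,
            "label": values[counts.argmax()], "indices": idx}

def generate_granular_balls(X, y, purity_threshold=1.0, min_size=2):
    """Split the universe until every ball is pure enough or too small."""
    queue = [np.arange(len(X))]
    balls = []
    while queue:
        idx = queue.pop()
        if len(idx) <= min_size or purity(y[idx]) >= purity_threshold:
            balls.append(make_ball(X, y, idx))
            continue
        assign = split_in_two(X[idx], y[idx])
        left, right = idx[assign == 0], idx[assign == 1]
        if len(left) == 0 or len(right) == 0:
            balls.append(make_ball(X, y, idx))  # cannot be split further
        else:
            queue.extend([left, right])
    return balls
```

Each resulting ball stands in for the objects it covers, which is the sense in which the universe space is reduced; how the paper assigns pseudolabels to unlabelled objects and samples from the balls is not specified in the abstract and is left out here.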

Data availability and access

The data that support the findings of this study are openly available in the UCI machine learning repository at http://archive.ics.uci.edu/ml, reference number [44].

References

[1] Khaire UM, Dhanalakshmi R (2022) Stability of feature selection algorithm: A review. Journal of King Saud University - Computer and Information Sciences 34(4):1060–1073
[2] Li X, Wang Y et al (2020) A Survey on Sparse Learning Models for Feature Selection. IEEE Transactions on Cybernetics 52(3):1642–1660
[3] Hancer E, Xue B et al (2022) Fuzzy filter cost-sensitive feature selection with differential evolution. Knowl-Based Syst 241:108259
[4] Huang P, Yang X (2022) Unsupervised feature selection via adaptive graph and dependency score. Pattern Recogn 127:108622
[5] Hja B, Bao Q (2022) On (O, G)-fuzzy rough sets based on overlap and grouping functions over complete lattices. Int J Approximate Reasoning 144:18–50
[6] Shu W, Yan Z et al (2022) Information granularity-based incremental feature selection for partially labeled hybrid data. Intelligent Data Analysis 26(1):33–56
[7] Hb A, Dla B et al (2022) Spatial rough set-based geographical detectors for nominal target variables. Inf Sci 586:525–539
[8] Jxa B, Bao Q et al (2022) A novel method to attribute reduction based on weighted neighborhood probabilistic rough sets. Int J Approximate Reasoning 144:1–17
[9] Chen B, Chen L et al (2022) Uncertainty Measurement and Attribute Reduction Algorithm Based on Kernel Similarity Rough Set Model. Journal of Mathematics 2022:5675200
[10] Hu Q, Yu D et al (2022) Granular computing based machine learning in the era of big data. Inf Sci 591:422–423
[11] Xia S, Zhang Z et al (2020) GBNRS: A Novel Rough Set Algorithm for Fast Adaptive Attribute Reduction in Classification. IEEE Trans Knowl Data Eng 34(3):1231–1242
[12] Qian Y, Liang X et al (2018) Local rough set: A solution to rough data analysis in big data. Int J Approximate Reasoning 97:38–63
[13] Wan J, Chen H et al (2021) A novel hybrid feature selection method considering feature interaction in neighborhood rough set. Knowl-Based Syst 227:107167
[14] Kim K, Jun C (2018) Rough set model based feature selection for mixed-type data with feature space decomposition. Expert Syst Appl 103:196–205
[15] Wang C, Huang Y et al (2019) Feature selection based on neighborhood self-information. IEEE Transactions on Cybernetics 50(1):4031–4042
[16] Pang Q, Zhang L (2020) Semi-supervised neighborhood discrimination index for feature selection. Knowl-Based Syst 204:106244
[17] Liu K, Yang X et al (2019) Rough set based semi-supervised feature selection via ensemble selector. Knowl-Based Syst 165:282–296
[18] Dai J, Hu Q et al (2017) Attribute selection for partially labeled categorical data by rough set approach. IEEE Transactions on Cybernetics 47:2460–2471
[19] Dai J, Liu Q (2022) Semi-supervised attribute reduction for interval data based on misclassification cost. Int J Mach Learn Cybern 13(6):1739–1750
[20] Wang F, Liu J et al (2018) Semi-supervised feature selection algorithm based on information entropy. Computer Science 45:427–430
[21] Gao C, Zhou J (2021) Granular conditional entropy-based attribute reduction for partially labeled data with proxy labels. Inf Sci 580:111–128
[22] Liu K, Tsang E (2020) Neighborhood attribute reduction approach to partially labeled data. Granular Computing 5:239–250
[23] Jiang Z, Liu K et al (2021) Accelerator for crosswise computing reduct. Appl Soft Comput 98:106740
[24] Ni P, Zhao S (2019) PARA: A positive-region based attribute reduction accelerator. Inf Sci 503:533–550
[25] Wang C, Huang Y et al (2019) Fuzzy rough set-based attribute reduction using distance measures. Knowl-Based Syst 164:205–212
[26] Dai J, Wang W et al (2019) Attribute selection based on a new conditional entropy for incomplete decision systems. Knowl-Based Syst 39:207–213
[27] Zhang X, Mei C et al (2020) Active incremental feature selection using a fuzzy-rough-set-based information entropy. IEEE Transactions on Fuzzy Systems 28(5):901–915
[28] Luo S, Miao D et al (2020) A neighborhood rough set model with nominal metric embedding. Inf Sci 520:373–388
[29] Sun L, Zhang X et al (2019) Feature selection using neighborhood entropy-based uncertainty measures for gene expression data classification. Inf Sci 502:18–41
[30] Wei W, Wu X et al (2018) Discernibility matrix based incremental attribute reduction for dynamic data. Knowl-Based Syst 140:142–157
[31] Lin R, Li J et al (2021) Attribute reduction in fuzzy multi-covering decision systems via observational-consistency and fuzzy discernibility. Journal of Intelligent & Fuzzy Systems 40(3):5239–5253
[32] Liu Y, Zheng L et al (2020) Discernibility matrix based incremental feature selection on fused decision tables. Int J Approximate Reasoning 118:1–26
[33] Li L, Li M et al (2019) A simple discernibility matrix for attribute reduction in formal concept analysis based on granular concepts. Journal of Intelligent & Fuzzy Systems 37(3):4325–4337
[34] Sheng K, Wang W et al (2020) Neighborhood Discernibility Degree Incremental Attribute Reduction Algorithm for Mixed Data. Acta Electron Sin 48(04):682–696
[35] Jiang Z, Liu K et al (2020) Accelerator for supervised neighborhood based attribute reduction. Int J Approximate Reasoning 119:122–150
[36] Jiang Z, Yang X et al (2019) Accelerator for multi-granularity attribute reduction. Knowl-Based Syst 177:145–158
[37] Chen Y, Wang P et al (2021) Granular ball guided selector for attribute reduction. Knowl-Based Syst 229:107326
[38] Zhao J, Liang J et al (2020) Accelerating information entropy-based feature selection using rough set theory with classified nested equivalence classes. Pattern Recogn 107:107517
[39] Rao X, Yang X et al (2020) Quickly calculating reduct: An attribute relationship based approach. Knowl-Based Syst 200(7):106014
[40] Xia S, Liu Y et al (2019) Granular ball computing classifiers for efficient, scalable and robust learning. Inf Sci 483:136–152
[41] Xia S, Peng D et al (2020) A Fast Adaptive k-means with No Bounds. IEEE Trans Pattern Anal Mach Intell 44(1):87–99
[42] Ba J, Chen Y et al (2021) Quick Strategy for Searching Granular Ball Rough Set Based Reduct. Journal of Nanjing University of Science and Technology 45(4):394–400
[43] Shu W, Qian W et al (2020) Incremental feature selection for dynamic hybrid data using neighborhood rough set. Knowl-Based Syst 194:105516
[44] UCI Machine Learning Repository. http://archive.ics.uci.edu/ml

Acknowledgements

This work is supported by the National Natural Science Foundation of China (No. 62266018 and No. 61966016), the Natural Science Foundation of Jiangxi Province, China (No. 20224BAB202020), and the National Key Research and Development Program of China (No. 2020YFD1100605).

Author information

Authors and Affiliations

School of Information Engineering, East China Jiaotong University, Nanchang, China
Wenhao Shu, Jianhui Yu & Ting Chen
School of Software, Jiangxi Agricultural University, Nanchang, China
Wenbin Qian

Contributions

Conceptualization: Wenhao Shu, Jianhui Yu; Methodology: Jianhui Yu; Writing – original draft preparation: Jianhui Yu, Ting Chen; Writing – review and editing: Jianhui Yu, Wenhao Shu; Funding acquisition: Wenbin Qian.

Corresponding author

Correspondence to Wenbin Qian.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shu, W., Yu, J., Chen, T. et al. Neighbourhood discernibility degree-based semisupervised feature selection for partially labelled mixed-type data with granular ball. Appl Intell 53, 22467–22487 (2023). https://doi.org/10.1007/s10489-023-04657-7

Accepted: 20 April 2023
Published: 29 June 2023
Issue Date: October 2023
DOI: https://doi.org/10.1007/s10489-023-04657-7
