Semi-Supervised Fuzzy-Rough Feature Selection

Jensen, Richard; Vluymans, Sarah; Parthaláin, Neil Mac; Cornelis, Chris; Saeys, Yvan

doi:10.1007/978-3-319-25783-9_17

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9437)

Abstract

With the continued growth in dataset sizes in recent times, feature or attribute selection has become a necessary step in tackling the resulting intractability. Indeed, as the number of dimensions increases, the number of data instances required to generate accurate models increases exponentially. Fuzzy-rough set-based feature selection techniques offer great flexibility when dealing with real-valued and noisy data; however, most current approaches focus on the supervised domain, where the data object labels are known. Very little work has been carried out using fuzzy-rough sets in the areas of unsupervised or semi-supervised learning. This paper proposes a novel approach for semi-supervised fuzzy-rough feature selection in which the object labels in the data may be only partially present. The approach has the appealing property that any generated subsets are also valid (super)reducts when the whole dataset is labelled. The experimental evaluation demonstrates that the proposed approach can generate stable and valid subsets even when up to 90% of the data object labels are missing.

Notes

1. When \(B = \{a\}\), i.e., \(B\) is a singleton, \(R_a\) is written rather than \(R_{\{a\}}\).
2. A t-norm \(\mathcal{T}\) is an increasing, commutative, associative \([0,1]^2 \rightarrow [0,1]\) mapping satisfying \(\mathcal{T}(x,1) = x\) for all \(x \in [0,1]\).
3. An implicator \(\mathcal{I}\) is a \([0,1]^2 \rightarrow [0,1]\) mapping that is decreasing in its first and increasing in its second argument, satisfying \(\mathcal{I}(0,0)=\mathcal{I}(0,1)=\mathcal{I}(1,1)=1\) and \(\mathcal{I}(1,0)=0\).
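
The definitions in notes 2 and 3 can be made concrete with a small sketch. The Łukasiewicz t-norm and implicator used here are a common textbook pair and are an assumption for illustration only; the text above does not say which operators the paper uses. The helper names (`t_norm`, `implicator`, `is_t_norm_on_grid`, `is_implicator_on_grid`) are hypothetical, and checking the axioms on a finite grid is a sanity check, not a proof.

```python
from fractions import Fraction
from itertools import product

def t_norm(x, y):
    """Lukasiewicz t-norm (illustrative choice): T(x, y) = max(0, x + y - 1)."""
    return max(0, x + y - 1)

def implicator(x, y):
    """Lukasiewicz implicator (illustrative choice): I(x, y) = min(1, 1 - x + y)."""
    return min(1, 1 - x + y)

# Exact rational grid 0, 1/10, ..., 1 so equality tests are free of rounding error.
GRID = [Fraction(i, 10) for i in range(11)]

def is_t_norm_on_grid(t, grid=GRID):
    """Check the t-norm conditions of note 2 on every grid point."""
    commutative = all(t(x, y) == t(y, x) for x, y in product(grid, repeat=2))
    associative = all(t(t(x, y), z) == t(x, t(y, z))
                      for x, y, z in product(grid, repeat=3))
    # Given commutativity, monotonicity in the first argument covers both.
    increasing = all(t(x1, y) <= t(x2, y)
                     for x1, x2, y in product(grid, repeat=3) if x1 <= x2)
    neutral = all(t(x, 1) == x for x in grid)
    return commutative and associative and increasing and neutral

def is_implicator_on_grid(imp, grid=GRID):
    """Check the implicator conditions of note 3 on every grid point."""
    decreasing_first = all(imp(x1, y) >= imp(x2, y)
                           for x1, x2, y in product(grid, repeat=3) if x1 <= x2)
    increasing_second = all(imp(x, y1) <= imp(x, y2)
                            for x, y1, y2 in product(grid, repeat=3) if y1 <= y2)
    boundary = (imp(0, 0) == imp(0, 1) == imp(1, 1) == 1) and imp(1, 0) == 0
    return decreasing_first and increasing_second and boundary

if __name__ == "__main__":
    print(is_t_norm_on_grid(t_norm))
    print(is_implicator_on_grid(implicator))
```

Any other candidate operator can be passed to the same two checkers; a mapping such as `min(1, x + y)` fails the t-norm check because it violates \(\mathcal{T}(x,1) = x\).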

Acknowledgment

Neil Mac Parthaláin would like to acknowledge the financial support for this research through NISCHR (National Institute for Social Care and Health Research) Wales, Grant reference: RFS-12-37. Sarah Vluymans is supported by the Special Research Fund (BOF) of Ghent University. Chris Cornelis was partially supported by the Spanish Ministry of Science and Technology under the project TIN2011-28488 and the Andalusian Research Plans P11-TIC-7765, P10-TIC-6858 and P12-TIC-2958.

Author information

Authors and Affiliations

Department of Computer Science, Aberystwyth University, Aberystwyth, Ceredigion, Wales, UK
Richard Jensen & Neil Mac Parthaláin
Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
Sarah Vluymans & Chris Cornelis
VIB Inflammation Research Center, Zwijnaarde, Belgium
Sarah Vluymans & Yvan Saeys
Department of Computer Science and AI CITIC-UGR, University of Granada, Granada, Spain
Chris Cornelis
Department of Respiratory Medicine, Ghent University, Ghent, Belgium
Yvan Saeys

Corresponding author

Correspondence to Richard Jensen.

Editor information

Editors and Affiliations

University of Regina, Regina, SK, Canada
Yiyu Yao
Tianjin University, Tianjin, China
Qinghua Hu
Chongqing University of Posts and Telecommunications, Chongqing, China
Hong Yu
University of Kansas, Lawrence, KS, USA
Jerzy W. Grzymala-Busse

Cite this paper

Jensen, R., Vluymans, S., Parthaláin, N.M., Cornelis, C., Saeys, Y. (2015). Semi-Supervised Fuzzy-Rough Feature Selection. In: Yao, Y., Hu, Q., Yu, H., Grzymala-Busse, J.W. (eds) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. Lecture Notes in Computer Science, vol 9437. Springer, Cham. https://doi.org/10.1007/978-3-319-25783-9_17

DOI: https://doi.org/10.1007/978-3-319-25783-9_17
Published: 08 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25782-2
Online ISBN: 978-3-319-25783-9
eBook Packages: Computer Science, Computer Science (R0)
