Ensemble constrained Laplacian score for efficient and robust semi-supervised feature selection

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

In this paper, we propose an efficient and robust approach to semi-supervised feature selection based on the constrained Laplacian score. The main weakness of this score lies in its dependence on scant supervision information in the form of pairwise constraints, which are known to carry noise that can deteriorate learning performance. In this work, we mitigate the negative effects of any single constraint set by varying its sources, through an ensemble technique that combines resampling of the data (bagging) with a random subspace strategy. Experiments on high-dimensional datasets validate the proposed approach and compare it with other representative feature selection methods.
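
To make the scheme concrete, here is a minimal Python sketch of such an ensemble, assuming the classical unconstrained Laplacian score [21] as a stand-in for the constrained Laplacian score of [3], whose formula is not reproduced in this excerpt. The function names and parameters (n_components, subspace_frac, the rank-averaging consensus) are illustrative choices, not the authors' implementation; their original code is available at the URL given in the Notes below.

```python
import numpy as np

def laplacian_score(X, n_neighbors=5, t=1.0):
    """Classical Laplacian score (He et al. [21]); smaller = more relevant.
    Stand-in for the constrained score of [3], which additionally injects
    must-link/cannot-link constraints into the similarity graph."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    S = np.exp(-d2 / t)                                  # heat-kernel similarity
    far = np.argsort(d2, axis=1)[:, n_neighbors + 1:]    # keep self + k nearest
    for i in range(n):
        S[i, far[i]] = 0.0
    S = np.maximum(S, S.T)                               # symmetrize the kNN graph
    D = S.sum(axis=1)                                    # node degrees
    L = np.diag(D) - S                                   # graph Laplacian
    scores = np.empty(X.shape[1])
    for r in range(X.shape[1]):
        f = X[:, r] - (X[:, r] @ D) / D.sum()            # remove trivial component
        denom = f @ (D * f)
        scores[r] = (f @ L @ f) / denom if denom > 1e-12 else np.inf
    return scores

def ensemble_score(X, n_components=20, subspace_frac=0.5, seed=0):
    """Bagging + random subspaces; consensus by averaging per-component ranks."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = max(1, int(subspace_frac * d))
    rank_sum, hits = np.zeros(d), np.zeros(d)
    for _ in range(n_components):
        rows = rng.choice(n, size=n, replace=True)       # bootstrap sample (bagging)
        feats = rng.choice(d, size=k, replace=False)     # random feature subspace
        s = laplacian_score(X[np.ix_(rows, feats)])
        rank_sum[feats] += np.argsort(np.argsort(s))     # rank 0 = best score
        hits[feats] += 1
    mean_rank = np.where(hits > 0, rank_sum / np.maximum(hits, 1), np.inf)
    return np.argsort(mean_rank)                         # best features first
```

Calling ensemble_score(X)[:m] would return the indices of the m top-ranked features; averaging ranks rather than raw scores keeps components built on different bootstrap samples and subspaces comparable.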


Notes

  1. http://perso.univ-lyon1.fr/haytham.elghazel/EnsCLS/EnsCLS.zip.

References

  1. Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Boldrick JC, Sabet H, Tran T, Yu X, Powell JI, Yang L, Marti GE, Moore T, Hudson J, Lu L, Lewis DB, Tibshirani R, Sherlock G, Chan WC, Greiner TC, Weisenburger DD, Armitage JO, Warnke R, Levy R, Wilson W, Grever MR, Byrd JC, Botstein D, Brown PO, Staudt LM (2000) Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403(6769):503–511

  2. Barkia H, Elghazel H, Aussem A (2011) Semi-supervised feature importance evaluation with ensemble learning. In: IEEE ICDM, pp 31–40

  3. Benabdeslem K, Hindawi M (2011) Constrained Laplacian score for semi-supervised feature selection. In: Proceedings of ECML-PKDD conference, pp 204–218

  4. Benabdeslem K, Hindawi M (2014) Efficient semi-supervised feature selection: constraint, relevance and redundancy. IEEE Trans Knowl Data Eng 26(5):1131–1143

  5. Frank A, Asuncion A (2010) UCI machine learning repository. Available at http://archive.ics.uci.edu/ml

  6. Breiman L (1996) Bagging predictors. Mach Learn 26(2):123–140

  7. Breiman L (2001) Random forests. Mach Learn 45(1):5–32

  8. Chang CC, Lin CJ (2011) LIBSVM: A library for support vector machines. ACM Trans Intell Syst Technol 2:27:1–27:27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm

  9. Cormen TH, Stein C, Rivest RL, Leiserson CE (2001) Introduction to algorithms. McGraw-Hill Higher Education, New York

  10. Davidson I, Wagstaff K, Basu S (2006) Measuring constraint-set utility for partitional clustering algorithms. In: Proceedings of ECML/PKDD

  11. Demsar J (2006) Statistical comparisons of classifiers over multiple datasets. J Mach Learn Res 7:1–30

  12. Dietterich T (2000) Ensemble methods in machine learning. In: First international workshop on multiple classifier systems, pp 1–15

  13. Duda RO, Hart PE, Stork DG (2000) Pattern classification. Wiley Interscience, New York

  14. Dy J, Brodley CE (2004) Feature selection for unsupervised learning. J Mach Learn Res 5:845–889

  15. Elghazel H, Aussem A (2015) Unsupervised feature selection with ensemble learning. Mach Learn 98(1–2):157–180

  16. Freund Y, Schapire R (1996) Experiments with a new boosting algorithm. In: 13th international conference on machine learning, pp 276–280

  17. Golub T, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov J, Coller H (1999) Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531–537

  18. Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182

  19. Guyon I, Gunn S, Nikravesh M, Zadeh LA (2006) Feature extraction: foundations and applications. Series studies in fuzziness and soft computing. Physica-Verlag, Springer, Berlin

  20. Hady MFA, Schwenker F (2010) Combining committee-based semi-supervised learning and active learning. J Comput Sci Technol 25(4):681–698

  21. He X, Cai D, Niyogi P (2005) Laplacian score for feature selection. Adv Neural Inf Process Syst 17:507–514

  22. Hindawi M, Allab K, Benabdeslem K (2011) Constraint selection based semi-supervised feature selection. In: Proceedings of IEEE ICDM, pp 1080–1085

  23. Hindawi M, Elghazel H, Benabdeslem K (2013) Efficient semi-supervised feature selection by an ensemble approach. In: COPEM@ECML/PKDD. International workshop on complex machine learning problems with ensemble methods, pp 41–55

  24. Ho TK (1998) The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell 20(8):832–844

  25. Hong Y, Kwong S, Chang Y, Ren Q (2008) Consensus unsupervised feature ranking from multiple views. Pattern Recognit Lett 29(5):595–602

  26. Hong Y, Kwong S, Chang Y, Ren Q (2008) Unsupervised feature selection using clustering ensembles and population based incremental learning algorithm. Pattern Recognit 41(9):2742–2756

  27. Kalakech M, Biela P, Macaire L, Hamad D (2011) Constraint scores for semi-supervised feature selection: a comparative study. Pattern Recognit Lett 32(5):656–665

  28. Kohonen T (2001) Self organizing map. Springer, Berlin

  29. Kuncheva LI (2007) A stability index for feature selection. In: Artificial intelligence and applications, pp 421–427

  30. Li M, Zhou ZH (2007) Improve computer-aided diagnosis with machine learning techniques using undiagnosed samples. IEEE Trans Syst Man Cybern 37(6):1088–1098

  31. Saeys Y, Abeel T, de Peer YV (2008) Robust feature selection using ensemble feature selection techniques. In: ECML/PKDD (2), pp 313–325

  32. Singh D, Febbo PG, Ross K, Jackson DG, Manola J, Ladd C, Tamayo P, Renshaw AA, D’Amico AV, Richie JP (2002) Gene expression correlates of clinical prostate cancer behavior. Cancer Cell 1(2):203–209

  33. Strehl A, Ghosh J (2002) Cluster ensembles—a knowledge reuse framework for combining multiple partitions. J Mach Learn Res 3:583–617

  34. Sun D, Zhang D (2010) Bagging constraint score for feature selection with pairwise constraints. Pattern Recognit 43:2106–2118

  35. Sun Y, Todorovic S, Goodison S (2010) Local learning based feature selection for high dimensional data analysis. IEEE Trans Pattern Anal Mach Intell 32(9):1610–1626

  36. Topchy A, Jain A, Punch W (2005) Clustering ensembles: models of consensus and weak partitions. IEEE Trans Pattern Anal Mach Intell 27(12):1866–1881

  37. Yaslan Y, Cataltepe Z (2010) Co-training with relevant random subspaces. Neurocomputing 73(10–12):1652–1661

  38. Yu L, Liu H (2003) Feature selection for high-dimensional data: a fast correlation-based filter solution. In: Proceedings of international conference on machine leaning, pp 856–863

  39. Zhang D, Chen S, Zhou Z (2008) Constraint score: a new filter method for feature selection with pairwise constraints. Pattern Recognit 41(5):1440–1451

  40. Zhao Z, Liu H (2007) Semi-supervised feature selection via spectral analysis. In: Proceedings of SIAM data mining (SDM), pp 641–646

  41. Zhao Z, Morstatter F, Sharma S, Alelyani S, Anand A (2010) Advancing feature selection research—ASU feature selection repository. TR-10-007

Acknowledgments

We thank the anonymous reviewers for their very useful comments and suggestions.

Author information

Corresponding author

Correspondence to Khalid Benabdeslem.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

This work extends our idea recently presented at the COPEM@ECML/PKDD'13 workshop [23].

About this article

Cite this article

Benabdeslem, K., Elghazel, H. & Hindawi, M. Ensemble constrained Laplacian score for efficient and robust semi-supervised feature selection. Knowl Inf Syst 49, 1161–1185 (2016). https://doi.org/10.1007/s10115-015-0901-0
