Abstract
In many pattern recognition applications, feature selection and instance selection are two data preprocessing methods that aim to reduce the computational cost of the learning process. Moreover, in some cases, feature subset selection can improve classification performance. Both are of interest because the choice of features and instances greatly influences the performance of the learnt models as well as their training costs. In the past, the two problems were unified by solving a global optimization problem using meta-heuristics. This paradigm does not exploit the manifold structure of the data and can be computationally expensive. To the best of our knowledge, the joint use of sparse modeling representative selection and feature subset relevance has not been exploited by joint feature and instance selection methods. In this paper, we target joint feature and instance selection by adopting feature subset relevance and sparse modeling representative selection. More precisely, we propose three schemes for joint feature and instance selection. The first is a wrapper technique, while the remaining two are filter approaches. In the filter approaches, the search process adopts a genetic algorithm in which the evaluation is mainly given by a score that quantifies the goodness of the features and instances. An efficient instance selection technique is integrated into the search process in order to adapt the instances to the candidate feature subset. We evaluate the performance of the proposed schemes on image classification, using the nearest neighbor classifier and the support vector machine classifier. The study is conducted on five public image datasets. The experiments show the superiority of the proposed schemes over various baselines. The results confirm that the filter approaches lead to promising improvements in classification accuracy when both feature selection and instance selection are adopted.
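As a rough illustration of the filter idea described above, the following minimal Python sketch runs a genetic algorithm over binary feature masks and, for each candidate mask, re-selects instances in the corresponding subspace before scoring. It is not the method of the paper: the trace-ratio-style score stands in for feature subset relevance, the nearest-to-centroid rule stands in for sparse modeling representative selection, and every function name, parameter, and the synthetic data are assumptions made for this example.

```python
import random

def subset_score(X, y, mask, eps=1e-9):
    """Stand-in relevance score: between-class over within-class scatter."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return float("-inf")  # an empty feature subset is never acceptable
    between = within = 0.0
    for j in cols:
        values = [row[j] for row in X]
        mean = sum(values) / len(values)
        for c in set(y):
            vc = [v for v, label in zip(values, y) if label == c]
            mc = sum(vc) / len(vc)
            between += len(vc) * (mc - mean) ** 2
            within += sum((v - mc) ** 2 for v in vc)
    return between / (within + eps)

def select_instances(X, y, mask, per_class):
    """Stand-in instance selection: per class, keep the instances closest
    to the class centroid, measured in the selected feature subspace."""
    cols = [j for j, bit in enumerate(mask) if bit]
    kept = []
    for c in sorted(set(y)):
        idx = [i for i, label in enumerate(y) if label == c]
        centroid = [sum(X[i][j] for i in idx) / len(idx) for j in cols]
        dist = lambda i: sum((X[i][j] - m) ** 2 for j, m in zip(cols, centroid))
        kept.extend(sorted(idx, key=dist)[:per_class])
    return sorted(kept)

def joint_score(X, y, mask, per_class):
    """Adapt the instances to the candidate mask, then score the pair."""
    if not any(mask):
        return float("-inf"), []
    kept = select_instances(X, y, mask, per_class)
    score = subset_score([X[i] for i in kept], [y[i] for i in kept], mask)
    return score, kept

def ga_joint_search(X, y, per_class, pop_size=8, generations=20, p_mut=0.1, seed=0):
    """Genetic search over feature masks with elitism, tournament selection,
    one-point crossover, and bit-flip mutation."""
    rng = random.Random(seed)
    d = len(X[0])
    fitness = lambda m: joint_score(X, y, m, per_class)[0]
    # Start from the full feature set plus random masks.
    pop = [[1] * d] + [[rng.randint(0, 1) for _ in range(d)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        nxt = [pop[0][:]]  # elitism: the best mask always survives
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 2), key=fitness)
            b = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randrange(1, d) if d > 1 else 0
            child = a[:cut] + b[cut:]
            nxt.append([bit ^ 1 if rng.random() < p_mut else bit for bit in child])
        pop = nxt
    best = max(pop, key=fitness)
    return best, joint_score(X, y, best, per_class)[1]

# Synthetic two-class data (hypothetical): feature 0 is informative, the rest is noise.
data_rng = random.Random(1)
y = [0] * 20 + [1] * 20
X = [[10 * c + data_rng.gauss(0, 1)] + [data_rng.gauss(0, 1) for _ in range(3)] for c in y]
best_mask, kept = ga_joint_search(X, y, per_class=5)
print("feature mask:", best_mask)
print("kept instance indices:", kept)
```

Because the all-ones mask is placed in the initial population and the best individual is carried over each generation, the returned mask never scores below the full feature set under this stand-in criterion. A wrapper variant would replace `joint_score` with the validation accuracy of a classifier trained on the selected features and instances.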
Cite this article
Dornaika, F. Joint feature and instance selection using manifold data criteria: application to image classification. Artif Intell Rev 54, 1735–1765 (2021). https://doi.org/10.1007/s10462-020-09889-4
DOI: https://doi.org/10.1007/s10462-020-09889-4