Abstract
We propose a local–global classification scheme in which the feature space is first segmented by an unsupervised algorithm so that, in a second phase, distinct classification methods can be applied in each of the generated sub-regions. The proposed segmentation process intentionally produces difficult-to-classify and easy-to-classify sub-regions. Consequently, it is possible to output, besides the classification labels, a measure of confidence for these labels. In almost homogeneous regions, one may be nearly sure of the classification result. The algorithm has a built-in stopping criterion to avoid over-dividing the space, which would lead to overfitting. The Cauchy–Schwarz divergence is used as a measure of homogeneity in each partition. The proposed algorithm has shown very good results when compared with 52 prototype selection algorithms. It also brings the advantage of unveiling, a priori, areas of the feature space where one should expect more (or less) difficulty in classification.
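To make the homogeneity measure concrete, the following is a minimal sketch of the Cauchy–Schwarz divergence between two sample sets, estimated with Gaussian Parzen windows. It illustrates the general definition D_CS(p, q) = -log( ∫pq / sqrt(∫p² ∫q²) ) and is not the authors' implementation. The function names, the bandwidth parameter `sigma`, and the sample points are assumptions introduced here for illustration; the abstract does not say how the divergence is estimated or which sets are compared in each partition.

```python
import math


def _mean_kernel(a, b, sigma):
    """Average Gaussian kernel value over all pairs (x in a, y in b).

    Convolving two Gaussian Parzen windows of width sigma gives a Gaussian
    of variance 2 * sigma**2, hence the 4 * sigma**2 in the exponent. The
    normalising constant is omitted because it cancels in the ratio below.
    """
    total = 0.0
    for x in a:
        for y in b:
            sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
            total += math.exp(-sq_dist / (4.0 * sigma ** 2))
    return total / (len(a) * len(b))


def cauchy_schwarz_divergence(a, b, sigma=1.0):
    """Parzen-window estimate of the Cauchy-Schwarz divergence.

    `a` and `b` are sequences of equal-length numeric tuples.
    `sigma` is an assumed kernel bandwidth (hypothetical default).
    """
    cross = _mean_kernel(a, b, sigma)
    self_a = _mean_kernel(a, a, sigma)
    self_b = _mean_kernel(b, b, sigma)
    return -math.log(cross / math.sqrt(self_a * self_b))


# Hypothetical sample sets, e.g. the points of two classes inside one region.
class_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
class_near = [(0.5, 0.5), (1.0, 1.0), (0.0, 0.5)]
class_far = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]

d_near = cauchy_schwarz_divergence(class_a, class_near)
d_far = cauchy_schwarz_divergence(class_a, class_far)
```

Under this estimate the divergence is zero for identical sample sets and grows as the two sets separate, so `d_far` exceeds `d_near`. How such a value is thresholded or combined into a stopping criterion is specific to the paper and is not reproduced here.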
Acknowledgements
This research has been partially supported by the Brazilian research agencies CAPES (PROEX), FAPERJ, and CNPq.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Ethical approval
This paper does not contain any studies with human participants performed by any of the authors.
Informed consent
This study does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Marcelino, C.G., Pedreira, C.E. Feature space partition: a local–global approach for classification. Neural Comput & Applic 34, 21877–21890 (2022). https://doi.org/10.1007/s00521-022-07647-x
DOI: https://doi.org/10.1007/s00521-022-07647-x