One-Sided Prototype Selection on Class Imbalanced Dissimilarity Matrices

Millán-Giraldo, Mónica; García, Vicente; Sánchez, J. Salvador

doi:10.1007/978-3-642-34166-3_43

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7626)

Included in the following conference series:

Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR)

Abstract

In the dissimilarity representation paradigm, several prototype selection methods have been used to select a small representation set that generates a low-dimensional dissimilarity space. These methods have also been used to reduce the size of the dissimilarity matrix. However, these approaches assume a relatively balanced class distribution, an assumption that is grossly violated in many real-life problems, where the ratios of prior probabilities between classes are often extremely skewed. In this paper, we study well-known prototype selection methods adapted to learning from an imbalanced dissimilarity matrix. More specifically, we propose using these methods to under-sample the majority class in the dissimilarity space. The experimental results show that this one-sided selection strategy performs better than the classical prototype selection methods applied over all classes.
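
The one-sided idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say which selection rules or implementation the authors used, so everything below is an assumption: the sketch uses a Hart-style condensed nearest neighbor rule as the selection method, applies it only to the majority class, and keeps every minority object. The function name `one_sided_condense` and its parameters are hypothetical.

```python
import numpy as np


def one_sided_condense(D, y, majority_label):
    """Condense only the majority class of a square dissimilarity matrix.

    D: (n, n) array of pairwise dissimilarities.
    y: (n,) array of class labels.
    Returns the sorted indices of the retained objects.
    Assumption: Hart-style condensing stands in for the selection method.
    """
    D = np.asarray(D, dtype=float)
    y = np.asarray(y)
    minority = np.flatnonzero(y != majority_label)
    majority = np.flatnonzero(y == majority_label)
    if majority.size == 0:
        return minority

    # The store starts with all minority objects plus one majority seed.
    keep = [int(i) for i in minority] + [int(majority[0])]
    kept = set(keep)

    changed = True
    while changed:
        changed = False
        for i in map(int, majority):
            if i in kept:
                continue
            # 1-NN decision using only dissimilarities to the current store.
            nearest = keep[int(np.argmin(D[i, keep]))]
            if y[nearest] != y[i]:
                # Misclassified majority object: add it to the store.
                keep.append(i)
                kept.add(i)
                changed = True
    return np.array(sorted(keep))


# Toy example: dissimilarities are absolute differences of 1-D points.
x = np.array([0.0, 0.2, 1.0, 1.1, 1.2, 5.0, 5.1, 0.3])
y = np.array([1, 1, 0, 0, 0, 0, 0, 0])  # label 0 is the majority class
D = np.abs(x[:, None] - x[None, :])

keep = one_sided_condense(D, y, majority_label=0)
D_reduced = D[np.ix_(keep, keep)]  # smaller square dissimilarity matrix
D_space = D[:, keep]               # all objects described by retained prototypes
```

Under these assumptions, `D_reduced` corresponds to reducing the size of the dissimilarity matrix, while `D_space` corresponds to representing every object by its dissimilarities to the retained set. Which of the two the paper evaluates cannot be determined from the abstract alone.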

Author information

Authors and Affiliations

Intelligent Data Analysis Laboratory, University of Valencia, Av. Universitat s/n, 46100, Burjassot, Valencia, Spain
Mónica Millán-Giraldo
Institute of New Imaging Technologies, Department of Computer Languages and Systems, University Jaume I, Av. Sos Baynat s/n, 12071, Castelló de la Plana, Spain
Mónica Millán-Giraldo, Vicente García & J. Salvador Sánchez

Editor information

Editors and Affiliations

Department of Computer Science, University of Auckland, Private Bag 92019, 1142, Auckland, New Zealand
Georgy Gimel’farb
Department of Computer Science, University of York, Deramore Lane, YO10 5GH, York, UK
Edwin Hancock
Institute of Media and Information Technology, Chiba University, Yayoi-cho 1-33, 263-8522, Inage-ku, Chiba, Japan
Atsushi Imiya
Technische Universität/Fraunhofer IGD, Fraunhoferstraße 5, 64283, Darmstadt, Germany
Arjan Kuijper
Graduate School of Information Science and Technology, Hokkaido University, 060-0814, Sapporo, Japan
Mineichi Kudo
Graduate School of Engineering, Tohoku University, 6-6-05 Aoba, Aramaki, Aoba-ku, 980-8579, Sendai, Miyagi, Japan
Shinichiro Omachi
Centre for Vision, Speech and Signal Processing, University of Surrey, GU2 7XH, Guildford, Surrey, UK
Terry Windeatt
C&C Innovation Research Laboratories, NEC Corporation, 8916-47 Takayama-cho, Ikoma-Shi, Nara, Japan
Keiji Yamada

Cite this paper

Millán-Giraldo, M., García, V., Sánchez, J.S. (2012). One-Sided Prototype Selection on Class Imbalanced Dissimilarity Matrices. In: Gimel’farb, G., et al. Structural, Syntactic, and Statistical Pattern Recognition. SSPR/SPR 2012. Lecture Notes in Computer Science, vol 7626. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34166-3_43

DOI: https://doi.org/10.1007/978-3-642-34166-3_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34165-6
Online ISBN: 978-3-642-34166-3
eBook Packages: Computer Science, Computer Science (R0)
