Knowledge and Information Systems, Volume 56, Issue 1, pp 55–84

Dynamic affinity-based classification of multi-class imbalanced data with one-versus-one decomposition: a fuzzy rough set approach

  • Sarah Vluymans
  • Alberto Fernández
  • Yvan Saeys
  • Chris Cornelis
  • Francisco Herrera
Regular Paper


Class imbalance occurs when data elements are unevenly distributed among classes, which poses a challenge for classifiers. The core focus of the research community has been on binary-class imbalance, although there is a recent trend toward the general case of multi-class imbalanced data. The IFROWANN method, a classifier based on fuzzy rough set theory, stands out for its performance in two-class imbalanced problems. In this paper, we consider its extension to multi-class data by combining it with one-versus-one decomposition. The latter transforms a multi-class problem into two-class sub-problems, one for each pair of classes. Binary classifiers are applied to these sub-problems, after which their outcomes are aggregated into one prediction. We enhance the integration of IFROWANN in the decomposition scheme in two steps. First, we propose an adaptive weight setting for the binary classifier, addressing the varying characteristics of the sub-problems. We call this modified classifier IFROWANN-\({{\mathcal {W}}_{\mathrm{IR}}}\). Second, we develop a new dynamic aggregation method called WV–FROST that combines the predictions of the binary classifiers with the global class affinity before making a final decision. In a meticulous experimental study, we show that our complete proposal outperforms the state of the art on a wide range of multi-class imbalanced datasets.
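The one-versus-one scheme described above can be illustrated with a minimal Python sketch. It is not the authors' method: a nearest-centroid scorer stands in for IFROWANN, and plain weighted voting stands in for WV–FROST. All function names (`ovo_fit`, `ovo_predict`, `pair_confidence`, `imbalance_ratio`) are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict
from itertools import combinations
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def train_pair(X, y, a, b):
    """Stand-in binary learner (assumption): one centroid per class of the pair."""
    ca = centroid([x for x, label in zip(X, y) if label == a])
    cb = centroid([x for x, label in zip(X, y) if label == b])
    return ca, cb


def pair_confidence(model, x):
    """Confidence in [0, 1] that x belongs to the first class of the pair."""
    ca, cb = model
    da, db = math.dist(x, ca), math.dist(x, cb)
    if da + db == 0:
        return 0.5  # both centroids coincide with x: no preference
    return db / (da + db)


def ovo_fit(X, y):
    """Train one binary model per unordered pair of classes."""
    classes = sorted(set(y))
    return {(a, b): train_pair(X, y, a, b) for a, b in combinations(classes, 2)}


def ovo_predict(models, x):
    """Aggregate pairwise confidences by weighted voting; return the top class."""
    votes = defaultdict(float)
    for (a, b), model in models.items():
        conf = pair_confidence(model, x)
        votes[a] += conf
        votes[b] += 1.0 - conf
    return max(votes, key=votes.get)


def imbalance_ratio(y, a, b):
    """Size of the larger class divided by the size of the smaller one."""
    na, nb = y.count(a), y.count(b)
    return max(na, nb) / min(na, nb)
```

With m classes the decomposition yields m(m−1)/2 sub-problems, and each can have a different imbalance ratio. This variation between sub-problems is what an adaptive, per-pair weight setting would respond to; the sketch only computes the ratio and does not use it.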


Keywords: Imbalanced data, Multi-class classification, One-versus-one, Fuzzy rough set theory



The research of Sarah Vluymans is funded by the Special Research Fund (BOF) of Ghent University. This work was partially supported by the Spanish Ministry of Science and Technology under the Projects TIN2014-57251-P and TIN2015-68454-R, and by the Andalusian Research Plans P11-TIC-7765 and P12-TIC-2958. Yvan Saeys is an ISAC Marylou Ingram Scholar.



Copyright information

© Springer-Verlag London Ltd. 2017

Authors and Affiliations

  1. Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
  2. Data Mining and Modeling for Biomedicine, VIB Inflammation Research Center, Ghent, Belgium
  3. Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
  4. Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
