Learning from Imbalanced Datasets with Cross-View Cooperation-Based Ensemble Methods

  • Cécile Capponi
  • Sokol Koço
Part of the Unsupervised and Semi-Supervised Learning book series (UNSESUL)


In this paper, we address the problem of learning from imbalanced multi-class datasets in a supervised setting when multiple descriptions of the data, also called views, are available. Each view carries different information about the examples and, depending on the task at hand, may be better at recognizing only a subset of the classes. Some form of cooperation between the views is therefore needed for all classes to be recognized equally well, which is a crucial problem for imbalanced datasets in particular. The novelty of our work lies in capitalizing on the complementarity of the views so that each class can be processed by the most appropriate view(s), thus improving the per-class performance of the final classifier. The main contributions of this paper are two ensemble learning methods based on recent theoretical work on the use of the norm of the confusion matrix as an error measure; empirical results show the benefits of the proposed approaches.
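To illustrate why a confusion-matrix norm can be a more informative error measure than accuracy on imbalanced data, here is a minimal sketch. It is not the authors' method: it assumes the confusion matrix is row-normalized with its diagonal set to zero and that the spectral norm is used, which is one common reading of the cited theoretical works. The function name `confusion_norm` and the toy labels are hypothetical.

```python
import numpy as np


def confusion_norm(y_true, y_pred, n_classes):
    """Spectral norm of the row-normalized confusion matrix with a zeroed diagonal.

    Assumption: entry (p, q) estimates P(predicted = q | true = p) for p != q,
    so each class contributes equally regardless of how many examples it has.
    """
    counts = np.zeros((n_classes, n_classes), dtype=float)
    for t, p in zip(y_true, y_pred):
        counts[t, p] += 1.0
    row_totals = counts.sum(axis=1, keepdims=True)
    # Classes with no examples keep an all-zero row instead of dividing by zero.
    rates = np.divide(counts, row_totals,
                      out=np.zeros_like(counts), where=row_totals > 0)
    np.fill_diagonal(rates, 0.0)  # keep only the per-class error rates
    return np.linalg.norm(rates, ord=2)


# Toy imbalanced problem: 90 examples of class 0, 10 of class 1.
y_true = [0] * 90 + [1] * 10
y_majority = [0] * 100  # a classifier that always predicts the majority class

accuracy = float(np.mean(np.array(y_true) == np.array(y_majority)))
norm = confusion_norm(y_true, y_majority, n_classes=2)
print(f"accuracy = {accuracy:.2f}, confusion norm = {norm:.2f}")
```

In this toy case the majority-class predictor reaches 90% accuracy, yet the norm takes its largest possible value for a two-class problem because the minority class is never recognized.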



This work is partially funded by the French ANR project LIVES ANR-15-CE23-0026.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Cécile Capponi (1)
  • Sokol Koço (1)
  1. LIS, Aix-Marseille University, Toulon University, CNRS, Marseille, France
