A new hybrid classifier selection model based on mRMR method and diversity measures

  • Original Article
  • Published in International Journal of Machine Learning and Cybernetics

Abstract

Classifier subset selection has become an important stage in the design of multiple classifier systems (MCSs): it reduces the number of classifiers by eliminating identical and inaccurate members. Minimum redundancy maximum relevance (mRMR) is a feature selection method that balances relevance against redundancy, discarding similar members and keeping the most pertinent ones. In the current work, a novel classifier subset selection method based on the mRMR method and diversity measures is proposed for building an efficient classifier ensemble. The proposed selection model uses a greedy search algorithm with diversity-accuracy criteria to determine the optimal classifier set. The disagreement and Q-statistic measures are computed to estimate the diversity among the members, while relevance is used to measure the accuracy of the ensemble and its individual members. The proposed method is evaluated on 24 real datasets from the UCI repository and the Kuncheva collection. The results establish the efficiency of the proposed selection method, with superior performance compared to popular ensembles and several existing selection methods.
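The abstract's two technical ingredients, pairwise diversity measures and a greedy relevance-redundancy trade-off, can be illustrated compactly. The Python sketch below is a minimal illustration rather than the authors' exact algorithm: it computes the disagreement and Q-statistic from classifier predictions on a validation set, then greedily selects members by scoring each candidate's relevance (individual accuracy) minus its redundancy (mean Q-statistic with the members already chosen). The scoring rule and the helper names `pairwise_diversity` and `greedy_mrmr_selection` are illustrative assumptions.

```python
import numpy as np

def pairwise_diversity(pred_i, pred_j, y):
    """Disagreement and Q-statistic between two classifiers, given their
    predicted labels pred_i, pred_j and true labels y (1-D arrays)."""
    ci, cj = pred_i == y, pred_j == y
    n11 = np.sum(ci & cj)      # both correct
    n00 = np.sum(~ci & ~cj)    # both wrong
    n10 = np.sum(ci & ~cj)     # only the first correct
    n01 = np.sum(~ci & cj)     # only the second correct
    n = n11 + n00 + n10 + n01
    disagreement = (n10 + n01) / n
    denom = n11 * n00 + n10 * n01
    q_stat = (n11 * n00 - n10 * n01) / denom if denom else 0.0
    return disagreement, q_stat

def greedy_mrmr_selection(preds, y, k):
    """Greedily pick k classifiers from preds (a list of prediction arrays).
    Start with the most accurate one; then repeatedly add the candidate with
    the best relevance-minus-redundancy score, where relevance is individual
    accuracy and redundancy is the mean Q-statistic with the classifiers
    already selected (Q near +1 means the pair makes very similar errors)."""
    acc = [np.mean(p == y) for p in preds]
    selected = [int(np.argmax(acc))]
    remaining = set(range(len(preds))) - set(selected)
    while len(selected) < k and remaining:
        def score(c):
            redundancy = np.mean([pairwise_diversity(preds[c], preds[s], y)[1]
                                  for s in selected])
            return acc[c] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Here `preds` would hold each base classifier's predictions on a held-out validation split, and the returned indices define the pruned ensemble, which can then be combined, for instance, by majority vote. The paper's criteria also involve the disagreement measure and ensemble accuracy; this sketch keeps only the Q-statistic term for brevity.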

References

  1. Wang R, Wang XZ, Kwong S et al (2017) Incorporating diversity and informativeness in multiple-instance active learning. IEEE Trans Fuzzy Syst 25:1460–1475

  2. Wang XZ, Wang R, Xu C (2018) Discovering the relationship between generalization and uncertainty by incorporating complexity of classification. IEEE Trans Cybern 48:703–715

  3. Chan PP, Yeung DS, Ng WW, Lin CM, Liu JN (2012) Dynamic fusion method using localized generalization error model. Inf Sci 217:1–20

  4. Azizi N, Farah N, Sellami M (2010) Off-line handwritten word recognition using ensemble of classifier selection and features fusion. J Theoret Appl Inf Technol 14:141–150

  5. Kuncheva LI (2003) That elusive diversity in classifier ensembles. In: Pattern recognition and image analysis, LNCS, vol 2652. Springer, Berlin, pp 1126–1138

  6. Cheriguene S, Azizi N, Zemmal N, Dey N, Djellali H, Farah N (2016) Optimized tumor breast cancer classification using combining random subspace and static classifiers selection paradigms. In: Applications of intelligent optimization in biology and medicine. Springer, Berlin, pp 289–307

  7. Rahman A, Tasnim S (2014) Ensemble classifiers and their applications: a review. Int J Comput Trends Technol IJCTT 10:31–35

  8. Breiman L (1996) Bagging predictors. Mach Learn J 24:123–140

  9. Freund Y, Schapire RE (1996) Experiments with a new boosting algorithm. In: 13th international conference on machine learning, Bari, Italy, pp 148–156

  10. Ho TK (1998) The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell 20:832–844

  11. Wang G, Zhang Z, Sun J, Yang S, Larson CA (2015) POS-RS: a random subspace method for sentiment classification based on part-of-speech analysis. Inf Process Manag 51:458–479

  12. Álvarez A, Sierra B, Arruti A (2015) Classifier subset selection for the stacked generalization method applied to emotion recognition in speech. Sensors 16:21

  13. Kuncheva L (2000) Clustering-and-selection model for classifier combination. In: Proceedings of the fourth international conference on knowledge-based intelligent engineering systems and allied technologies, Brighton, UK, pp 185–188

  14. Azizi N, Farah N (2012) From static to dynamic ensemble of classifiers selection: application to Arabic handwritten recognition. Int J Knowl Based Intell Eng Syst 16:279–288

  15. Aksela M, Laaksonen J (2006) Using diversity of errors for selecting members of a committee classifier. Pattern Recogn 39:608–623

  16. Yang L (2011) Classifiers selection for ensemble learning based on accuracy and diversity. Proced Eng 15:4266–4270

  17. Mendialdua I, Arruti A, Jauregi E, Lazkano E, Sierra B (2015) Classifier subset selection to construct multi-classifiers by means of estimation of distribution algorithms. Neurocomputing 157:46–60

  18. Visentini I, Snidaro L, Foresti GL (2016) Diversity-aware classifier ensemble selection via f-score. Inf Fusion 28:24–43

  19. Peng H, Long F, Ding C (2005) Feature selection based on mutual information: criteria of max-dependency, max-relevance and min-redundancy. IEEE Trans Pattern Anal Mach Intell 27:1226–1238

  20. Paninski L (2003) Estimation of entropy and mutual information. Neural Comput 15:1191–1253

  21. El Akadi A, Amine A, El Ouardighi A, Aboutajdine D (2011) A two-stage gene selection scheme utilizing MRMR filter and GA wrapper. Knowl Inf Syst 26:487–500

  22. Li A, Hu L, Niu S, Cai Y, Chou K (2012) Predict and analyze S-nitrosylation modification sites with the mRMR and IFS approaches. J Proteom 75:1654–1665

  23. Cheriguene S, Azizi N, Dey N (2016) Ensemble classifiers construction using diversity measures and random subspace algorithm combination: application to glaucoma diagnosis. In: Medical imaging in clinical applications. Springer, Cham, pp 131–152

  24. Cheriguene S, Azizi N, Dey N, Ashour AS, Corina N, Shi F (2016) Classifier ensemble selection based on MRMR algorithm and diversity: an application of medical data classification. In: Proceedings of the 7th international workshop on soft computing applications, Arad, Romania, pp 375–384

  25. Gacquer D, Delcroix V, Delmotte F, Piechowiak S (2009) On the effectiveness of diversity when training multiple classifier systems. In: European conference on symbolic and quantitative approaches to reasoning and uncertainty, vol 5590. Springer, Verona, Italy, pp 493–504

  26. Parvin H, Minaei-bidgoli B, Shahpar H (2011) Classifier selection by clustering. In: Mexican conference on pattern recognition. Springer, Cancun, Mexico, pp 60–66

  27. Mao S, Jiao LC, Xiong L, Gou S (2011) Greedy optimization classifiers ensemble based on diversity. Pattern Recognit 44:1245–1261

  28. Strehl A, Ghosh J (2002) Cluster ensembles: a knowledge reuse framework for combining multiple partitions. J Mach Learn Res 3:583–617

  29. Liu H, Liu T, Wu J, Tao D, Fu Y (2015) Spectral ensemble clustering. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, Sydney, NSW, Australia, pp 715–724

  30. Singh V, Mukherjee L, Peng J, Xu J (2010) Ensemble clustering using semidefinite programming with applications. Mach Learn 79:177–200

  31. Huang D, Lai J-H, Wang C-D (2016) Ensemble clustering using factor graph. Pattern Recognit 50:131–142

  32. Huang D, Lai J-H, Wang C-D (2016) Robust ensemble clustering using probability trajectories. IEEE Trans Knowl Data Eng 28:1312–1326

  33. Fern X, Brodley C (2004) Solving cluster ensemble problems by bipartite graph partitioning. In: Proceedings of the twenty-first international conference on machine learning, Banff, Alberta, Canada, p 36

  34. Huang D, Lai J-H, Wang C-D (2015) Combining multiple clusterings via crowd agreement estimation and multi-granularity link analysis. Neurocomputing 170:240–250

  35. Huang D, Wang C-D, Lai J-H (2017) Locally weighted ensemble clustering. IEEE Trans Cybern 51:1–14

  36. Kumari P, Vaish A (2015) Information-theoretic measures on intrinsic mode function for the individual identification using EEG sensors. IEEE Sens J 15:4950–4960

  37. Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423

  38. Wang XZ, Zhang T, Wang R (2017) Non-iterative deep learning: incorporating restricted Boltzmann machine into multilayer random weight neural networks. IEEE Trans Syst Man Cybern Syst 1–10

  39. Leslie CS, Eskin E, Noble WS (2002) The spectrum kernel: a string kernel for SVM protein classification. Biocomputing 7:564–575

  40. Quinlan JR (1996) Bagging, boosting, and C4.5. In: Proceedings of the thirteenth national conference on artificial intelligence, vol 2, Portland, Oregon, pp 725–730

  41. Li H, Wen G, Yu Z, Zhou T (2013) Random subspace evidence classifier. Neurocomputing 110:62–69

  42. Li N, Yu Y, Zhou Z (2012) Diversity regularized ensemble pruning. In: Joint European conference on machine learning and knowledge discovery in databases, Bristol, UK, pp 330–345

  43. Krawczyk B (2016) Untrained weighted classifier combination with embedded ensemble pruning. Neurocomputing 196:14–22

  44. Cheriguene S, Azizi N, Farah N, Ziani A (2016) A two stage classifier selection ensemble based on mRMR algorithm and diversity measures. In: Computing systems and applications conference, Algiers, Algeria

  45. Azizi N, Farah N, Sellami M, Ennaji A (2010) Using diversity in classifier set selection for Arabic handwritten recognition. In: Multiple classifier systems. Springer, Berlin, pp 235–244

  46. Moreno-Seco F, Iñesta JM, Ponce de León PJ, Micó L (2006) Comparison of classifier fusion methods for classification in pattern recognition tasks. Lect Notes Comput Sci 4109:705–713

  47. Asuncion A, Newman DJ (2007) UCI machine learning repository. http://archive.ics.uci.edu/ml/datasets.html. Accessed 4 May 2015

  48. Kuncheva L (2004) Ludmila Kuncheva collection. http://pages.bangor.ac.uk/~mas00a/activities/real_data.html. Accessed 23 Apr 2015

  49. Witten HI, Frank E, Hall MA (2011) Data mining: practical machine learning tools and techniques. Morgan Kaufmann, Burlington

  50. Weka 3: Data mining software in Java. http://www.cs.waikato.ac.nz/ml/weka. Accessed 19 Apr 2016

  51. Margineantu DD, Dietterich TG (1997) Pruning adaptive boosting. In: Proceedings of the 14th international conference on machine learning, Nashville, TN, USA, pp 378–387

  52. Kuncheva LI (2013) A bound on kappa-error diagrams for analysis of classifier ensembles. IEEE Trans Knowl Data Eng 25:494–501

Author information

Correspondence to Soraya Cheriguene.

About this article

Cite this article

Cheriguene, S., Azizi, N., Dey, N. et al. A new hybrid classifier selection model based on mRMR method and diversity measures. Int. J. Mach. Learn. & Cyber. 10, 1189–1204 (2019). https://doi.org/10.1007/s13042-018-0797-6
