Building ensemble classifiers using belief functions and OWA operators

Abstract

Classification is a pervasive task in many forms of human activity. Recent interest in the classification process has focused on ensemble classifier systems, which combine the outputs of a number of individual classifiers. In this paper we propose a new approach for obtaining the final output of an ensemble classifier. The method presented here uses the Dempster–Shafer concept of belief functions to represent the confidence in the outputs of the individual classifiers. The combining of the individual classifier outputs is based on an aggregation process that can be seen as a fusion of the Dempster rule of combination with a generalized form of the OWA (ordered weighted averaging) operator. The use of the OWA operator provides an added degree of flexibility in expressing how the aggregation of the individual classifiers is performed.
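To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch of an OWA operator and of Dempster's rule of combination, not the authors' actual fusion of the two; the function names, the frozenset representation of focal sets, and the toy mass functions are illustrative assumptions.

    # Sketch only: the two building blocks named in the abstract.
    # Mass functions are dicts mapping focal sets (frozensets of class
    # labels) to masses; weights are a fixed OWA weight vector summing to 1.

    from itertools import product

    def owa(values, weights):
        """OWA aggregation: sort the arguments in descending order,
        then take the weighted sum with the fixed weight vector."""
        assert len(values) == len(weights)
        assert abs(sum(weights) - 1.0) < 1e-9
        ordered = sorted(values, reverse=True)
        return sum(w * b for w, b in zip(weights, ordered))

    def dempster_combine(m1, m2):
        """Dempster's rule: conjunctive combination of two mass functions,
        normalized by the mass K assigned to conflicting (empty) intersections."""
        combined = {}
        conflict = 0.0
        for (a, ma), (b, mb) in product(m1.items(), m2.items()):
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb
        if conflict >= 1.0:
            raise ValueError("total conflict: combination undefined")
        return {s: v / (1.0 - conflict) for s, v in combined.items()}

    # Toy usage: two classifiers expressing belief over classes {c1, c2}.
    theta = frozenset({"c1", "c2"})
    m_a = {frozenset({"c1"}): 0.7, theta: 0.3}
    m_b = {frozenset({"c1"}): 0.5, frozenset({"c2"}): 0.2, theta: 0.3}

    print(dempster_combine(m_a, m_b))          # combined belief assignment
    print(owa([0.7, 0.5, 0.9], [0.5, 0.3, 0.2]))  # 0.76 with these weights

Choosing the OWA weight vector is what gives the flexibility referred to in the abstract: weights concentrated on the largest arguments behave like an optimistic (or-like) aggregation, while weights concentrated on the smallest behave like a pessimistic (and-like) one.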

Author information

Corresponding author

Correspondence to Ronald R. Yager.

About this article

Cite this article

Reformat, M., Yager, R.R. Building ensemble classifiers using belief functions and OWA operators. Soft Comput 12, 543–558 (2008). https://doi.org/10.1007/s00500-007-0227-2

Keywords

  • Ensemble systems
  • Rule-based models
  • Belief functions
  • Dempster–Shafer evidence theory
  • Ordered weighted averaging operator