Comparing Combination Rules of Pairwise Neural Networks Classifiers

Neural Processing Letters

Abstract

A decomposition approach to multiclass classification splits the complete multiclass problem into a set of smaller classification problems that each involve only two classes (binary classification, or dichotomies). Any decomposition requires a recombination step that merges the outputs of the dichotomizers in order to solve the original multiclass problem. Of the several approaches to decomposition, the best known are one-against-all and one-against-one, the latter also called pairwise. In this paper, we focus on the pairwise decomposition approach to multiclass classification, with neural networks as the base learners for the dichotomies. We are primarily interested in the different possible ways to perform the recombination (or decoding). We review the standard methods used to decode the decomposition generated by a one-against-one approach. New decoding methods are proposed and compared to the standard ones. A stacking decoding is also proposed, which replaces all or part of the decoding with a trainable classifier that arbitrates among the conflicting predictions of the pairwise classifiers. The proposed methods try to cope with the main problem of pairwise decomposition: the use of irrelevant classifiers. A substantial gain is obtained on all datasets used in the experiments. Based on these results, we suggest future research directions that treat the recombination problem as an ensemble method.
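To make the pairwise decomposition and its decoding concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method proposed in the paper: the function names, the `pair_probs` dictionary, and the numeric outputs are all hypothetical, and the two decoders shown (majority voting and summing the pairwise estimates) are generic textbook rules, which may differ from the exact rules the paper compares.

```python
from itertools import combinations


def class_pairs(n_classes):
    """All unordered class pairs (i, j), i < j: one dichotomizer per pair."""
    return list(combinations(range(n_classes), 2))


def decode_by_voting(pair_probs, n_classes):
    """Each pairwise classifier casts one vote for the class it prefers."""
    votes = [0] * n_classes
    for (i, j), p in pair_probs.items():
        # p is the (assumed) estimate that the example is class i rather than j
        winner = i if p >= 0.5 else j
        votes[winner] += 1
    return max(range(n_classes), key=lambda c: votes[c])


def decode_by_sum(pair_probs, n_classes):
    """Accumulate the pairwise estimates themselves instead of hard votes."""
    scores = [0.0] * n_classes
    for (i, j), p in pair_probs.items():
        scores[i] += p
        scores[j] += 1.0 - p
    return max(range(n_classes), key=lambda c: scores[c])


# Hypothetical outputs of the three dichotomizers of a 3-class problem
# for a single example (illustrative numbers, not data from the paper).
pair_probs = {(0, 1): 0.9, (0, 2): 0.4, (1, 2): 0.45}

print(class_pairs(3))
print(decode_by_voting(pair_probs, 3))
print(decode_by_sum(pair_probs, 3))
```

On these illustrative numbers the two rules disagree: voting selects class 2, while summing the estimates selects class 0. If the example truly belongs to class 0, the (1, 2) classifier was never trained on that class, so its output is irrelevant, yet it still shifts the result. A stacked decoder in the sense of the abstract would replace these fixed rules with a classifier trained on the vector of pairwise outputs.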

Author information

Corresponding author

Correspondence to Olivier Lézoray.

About this article

Cite this article

Lézoray, O., Cardot, H. Comparing Combination Rules of Pairwise Neural Networks Classifiers. Neural Process Lett 27, 43–56 (2008). https://doi.org/10.1007/s11063-007-9058-5
