Towards a Linear Combination of Dichotomizers by Margin Maximization

  • Claudio Marrocco
  • Mario Molinara
  • Maria Teresa Ricamato
  • Francesco Tortorella
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5716)


In two-class problems, combining several dichotomizers is an established technique for improving classification performance. In this context the margin is a central concept, since several theoretical results show that improving the margin on the training set is beneficial for the generalization error of a classifier. This has been analyzed in particular for learning algorithms based on boosting, which build strong classifiers by combining many weak classifiers. In this paper we verify experimentally whether margin maximization is also beneficial when combining already trained classifiers. We employ an algorithm that evaluates the weights of a linear convex combination of dichotomizers so as to maximize the margin of the combination on the training set. Experiments on publicly available data sets show that a combination based on margin maximization can be particularly effective compared with other established fusion methods.
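The abstract does not say which algorithm computes the weights, so the following is only a rough sketch of the idea, not the authors' method. It assumes each dichotomizer outputs a real-valued score in [-1, 1], labels are +1 or -1, and the margin of a sample is the label times the weighted sum of scores. The function names (`min_margin`, `simplex_grid`, `max_margin_weights`), the toy scores, and the brute-force grid search over the weight simplex are all illustrative assumptions.

```python
from itertools import product


def min_margin(weights, scores, labels):
    """Smallest margin y_i * sum_k w_k * f_k(x_i) over the training set."""
    return min(
        y * sum(w * s for w, s in zip(weights, row))
        for row, y in zip(scores, labels)
    )


def simplex_grid(n_classifiers, steps):
    """Yield convex weight vectors (non-negative, summing to 1) on a regular grid."""
    for combo in product(range(steps + 1), repeat=n_classifiers - 1):
        rest = steps - sum(combo)
        if rest >= 0:
            yield tuple(c / steps for c in combo) + (rest / steps,)


def max_margin_weights(scores, labels, steps=20):
    """Grid point that maximizes the minimum training margin (brute force)."""
    n_classifiers = len(scores[0])
    return max(
        simplex_grid(n_classifiers, steps),
        key=lambda w: min_margin(w, scores, labels),
    )


# Hypothetical toy data: one row per training sample, one column per
# already trained dichotomizer; labels are +1 / -1.
scores = [
    [0.8, 0.2],
    [0.6, -0.4],
    [-0.2, -0.9],
    [-0.7, 0.1],
]
labels = [1, 1, -1, -1]

best = max_margin_weights(scores, labels)
print("weights:", best)
print("minimum margin:", min_margin(best, scores, labels))
print("uniform average margin:", min_margin((0.5, 0.5), scores, labels))
```

The grid search grows exponentially with the number of dichotomizers, so it is only practical for a handful of classifiers. Because the objective and constraints are linear in the weights, the same problem can also be posed as a linear program, which is one plausible way to scale it.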


Keywords: Multiple Classifier Systems, Two-class classification, Margins, Linear Combination



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Claudio Marrocco (1)
  • Mario Molinara (1)
  • Maria Teresa Ricamato (1)
  • Francesco Tortorella (1)
  1. DAEIMI, Università degli Studi di Cassino, Cassino, Italy
