ICANN 98 pp 693-698

Multivariate Linear Regression on Classifier Outputs: a Capacity Study

  • Yann Guermeur
  • Hélène Paugam-Moisy
  • Patrick Gallinari
Part of the Perspectives in Neural Computing book series (PERSPECT.NEURAL)


We consider the problem of combining the outputs of several classifiers, trained independently to perform a discrimination task, in order to improve on the prediction accuracy of the individual classifiers. We briefly describe the multivariate linear regression model, which has already been implemented successfully for that purpose, and we study its capacity using generalizations of the notion of VC dimension.
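As an illustration only, the idea of regressing class targets on the stacked outputs of several classifiers can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' model: it uses plain unconstrained least squares, whereas the model studied in the paper may constrain the weights. The function names, the toy data, and the noise level are all hypothetical.

```python
import numpy as np


def fit_linear_combiner(outputs, targets):
    """Least-squares weights mapping stacked classifier outputs to targets.

    outputs: (n_samples, n_classifiers * n_classes) stacked output vectors
    targets: (n_samples, n_classes) one-hot class indicators
    Assumption: no constraint is placed on the weights.
    """
    weights, *_ = np.linalg.lstsq(outputs, targets, rcond=None)
    return weights


def combine(outputs, weights):
    """Apply the fitted linear combiner to stacked classifier outputs."""
    return outputs @ weights


# Toy data: each hypothetical classifier emits a noisy copy of the one-hot target.
rng = np.random.default_rng(0)
n_samples, n_classes, n_classifiers = 200, 3, 2
labels = rng.integers(0, n_classes, size=n_samples)
targets = np.eye(n_classes)[labels]
blocks = [targets + 0.8 * rng.normal(size=targets.shape)
          for _ in range(n_classifiers)]
outputs = np.hstack(blocks)

weights = fit_linear_combiner(outputs, targets)
combined = combine(outputs, weights)
predicted = combined.argmax(axis=1)  # predicted class = largest combined score

# Training squared error of the combiner and of each classifier taken alone.
combined_mse = np.mean((combined - targets) ** 2)
individual_mse = [np.mean((block - targets) ** 2) for block in blocks]
```

On the training set the combiner's squared error cannot exceed that of any single classifier, because copying one classifier's outputs unchanged is itself a feasible linear combination. This says nothing about generalization, which is the question a capacity analysis addresses.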


Discriminant Function; Multivariate Linear Regression; Gradient Projection Method; Structural Risk Minimization; Convex Objective
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag London 1998

Authors and Affiliations

  • Yann Guermeur (1)
  • Hélène Paugam-Moisy (1)
  • Patrick Gallinari (2)
  1. LIP, URA CNRS 1398, ENS Lyon, Lyon Cedex 07, France
  2. LIP6, UMR CNRS 7606, Université Paris 6, Paris Cedex 05, France
