A Directed Inference Approach towards Multi-class Multi-model Fusion

  • Tianbao Yang
  • Lei Wu
  • Piero P. Bonissone
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7872)


In this paper, we propose a directed inference approach for multi-class multi-model fusion. Unlike traditional approaches, which learn a model in the training stage and apply it to new data points in the testing stage, the directed inference approach constructs a general direction of inference in the training stage and then builds an individual (ad hoc) rule for each given test point in the testing stage. In the present work, we propose a framework for applying directed inference to multiple-model fusion problems that consists of three components: (i) learning individual models on the training samples, (ii) nearest-neighbour search for constructing individual bias-correction rules, and (iii) learning optimal combination weights of the individual models for fusion. For inference on a test sample, the prediction scores of the individual models are first corrected with the bias estimated from the nearest training points, and the corrected scores are then combined using the learned optimal weights. We conduct extensive experiments that demonstrate the effectiveness of the proposed approach to multi-class multiple-model fusion.
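The test-time inference described above (local bias correction followed by weighted combination) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `fuse_predictions`, the Euclidean neighbour search, and the simple mean-bias estimate are all assumptions made for the example.

```python
import numpy as np

def fuse_predictions(test_x, train_x, train_bias, model_scores, weights, k=3):
    """Illustrative sketch of the fusion steps (hypothetical names).

    train_x      : (n, d) array of training points
    train_bias   : (n, m) signed errors of each of the m models on each training point
    model_scores : (m,)   raw scores of the m individual models on the test point
    weights      : (m,)   learned combination weights
    """
    # (ii) nearest-neighbour search: find the k training points closest to test_x
    dists = np.linalg.norm(train_x - test_x, axis=1)
    nn = np.argsort(dists)[:k]
    # estimate each model's local bias as the mean error over those neighbours
    est_bias = train_bias[nn].mean(axis=0)
    # correct each model's score, then (iii) combine with the learned weights
    corrected = model_scores - est_bias
    return float(np.dot(weights, corrected))
```

In a multi-class setting the same correction would be applied per class score before combining; the sketch keeps a single score per model for brevity.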


Keywords: Support Vector Machine · Test Point · Individual Model · Bias Correction · Neighbour Search





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Tianbao Yang
  • Lei Wu
  • Piero P. Bonissone
GE Global Research Center, USA
