In this paper, we propose a new method for training mixtures of linear SVM classifiers for nonlinear data classification. We cast linear SVMs in a probabilistic formulation and embed them in the mixture-of-experts model. The mixture weights are generated by a gating network that depends on the input data. The resulting mixture of linear SVMs can then be trained efficiently using the EM algorithm. Unlike previous SVM-based mixture-of-experts models, which use a divide-and-conquer strategy to reduce the training burden on large-scale data sets, the main purpose of our approach is to improve efficiency at test time. Experimental results show that the proposed model achieves the efficiency of linear classifiers in the prediction phase while maintaining the classification performance of nonlinear classifiers.
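To make the prediction-phase claim concrete, the following is a minimal sketch of how an input-dependent mixture of linear experts could be evaluated. It is not the authors' exact formulation: the softmax gate, the logistic link that turns each linear score into a probability, and all names (`predict_proba`, `gate_weights`, `expert_weights`) are assumptions made for illustration, and EM training is omitted. The cost per test point is one dot product per expert and per gate, so it grows with the number of experts times the input dimension rather than with the number of support vectors.

```python
import math

def dot(w, x):
    """Inner product of two equal-length sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))

def softmax(scores):
    """Numerically stable softmax; returns weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(x, gate_weights, expert_weights):
    """P(y = +1 | x) = sum_k g_k(x) * sigmoid(w_k . x).

    Assumption: softmax gate and logistic link, chosen for illustration only.
    """
    gates = softmax([dot(v, x) for v in gate_weights])
    return sum(g * sigmoid(dot(w, x)) for g, w in zip(gates, expert_weights))

def predict(x, gate_weights, expert_weights):
    """Hard label in {-1, +1} by thresholding the mixture probability."""
    return 1 if predict_proba(x, gate_weights, expert_weights) >= 0.5 else -1

# Hypothetical toy parameters: inputs are [x1, 1] (last entry is a bias term).
# Two linear experts jointly label the interval -1 < x1 < 1 as positive,
# a region that no single linear classifier on x1 can represent.
expert_weights = [[5.0, 5.0],    # positive when x1 > -1
                  [-5.0, 5.0]]   # positive when x1 < 1
gate_weights = [[-5.0, 0.0],     # favours expert 0 for negative x1
                [5.0, 0.0]]      # favours expert 1 for positive x1

labels = [predict([x1, 1.0], gate_weights, expert_weights)
          for x1 in (-3.0, -0.5, 0.0, 3.0)]
```

In this toy setting the gate routes each point to the expert responsible for its side of the axis, so the combined decision region is nonlinear even though every component is linear.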


Keywords (added by machine, not by the authors): Expert Model; Nonlinear Data; Skin Detection; Linear SVMs; Expert Framework



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Zhouyu Fu (1)
  • Antonio Robles-Kelly (1, 2)

  1. RSISE, Bldg. 115, Australian National University, Canberra, Australia
  2. National ICT Australia (NICTA), Canberra, Australia
