SVM Based Speaker Selection Using GMM Supervector for Rapid Speaker Adaptation

  • Jian Wang
  • Jianjun Lei
  • Jun Guo
  • Zhen Yang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4247)


In this paper, we propose a novel method for rapid speaker adaptation called speaker support vector selection (SSVS). Taking the Gaussian mixture model (GMM) as the speaker model, the method selects the reference speakers that are acoustically close to the test speaker. Unlike other selection methods, which only compute the likelihood between models, we use support vector machines (SVM) to obtain a 'more optimal speaker subset'. The selection is determined dynamically according to the distribution of the reference speakers close to the test speaker. Furthermore, a single-pass re-estimation procedure conditioned on the selected speakers is presented. The adaptation strategy was evaluated on a large-vocabulary speech recognition task. The presented method yields a 13% relative improvement in accuracy over the baseline system.
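The abstract does not spell out how the SVM picks the speaker subset, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that a GMM supervector is formed by stacking the component means, that the test speaker is treated as the single positive example against all reference speakers, and that the reference speakers lying on or inside the margin (the support vectors) are the ones selected. The linear kernel, the subgradient-descent trainer, the tolerance `tol`, the speaker names, and the toy means are all hypothetical choices made for illustration.

```python
# Toy sketch only: linear soft-margin SVM trained by full-batch
# subgradient descent in pure Python. All data below is made up.

def supervector(component_means):
    """Stack the mean vectors of a speaker's GMM into one flat vector."""
    return [value for mean in component_means for value in mean]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(xs, ys, lam=0.01, eta=0.01, steps=5000):
    """Minimise lam/2 * ||w||^2 + mean hinge loss (assumed settings)."""
    n, dim = len(xs), len(xs[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(xs, ys):
            if y * (dot(w, x) + b) < 1.0:  # this example violates the margin
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - eta * gi for wi, gi in zip(w, grad_w)]
        b -= eta * grad_b
    return w, b

def select_speakers(test_sv, reference_svs, tol=0.1):
    """Return the reference speakers that end up on or inside the margin."""
    names = list(reference_svs)
    xs = [test_sv] + [reference_svs[name] for name in names]
    ys = [1.0] + [-1.0] * len(names)  # test speaker vs. all references
    w, b = train_linear_svm(xs, ys)
    # For a reference speaker (label -1) the functional margin is -(w.x + b).
    return [name for name in names
            if -(dot(w, reference_svs[name]) + b) <= 1.0 + tol]

# Hypothetical two-component GMMs with one-dimensional means.
test_sv = supervector([[0.0], [0.0]])
references = {
    "spk_a": supervector([[2.0], [0.0]]),
    "spk_b": supervector([[0.0], [2.0]]),
    "spk_c": supervector([[6.0], [6.0]]),
    "spk_d": supervector([[7.0], [5.0]]),
}
selected = select_speakers(test_sv, references)
print(selected)
```

In this toy setup the size of the selected subset is not fixed in advance: it depends on how the reference supervectors are arranged around the test speaker, which is one way to read the "dynamically determined" selection described above. A real system would likely use a kernel SVM from an established library and far higher-dimensional supervectors.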


Support Vector Machine · Hidden Markov Model · Gaussian Mixture Model · Speaker Identification
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jian Wang (1)
  • Jianjun Lei (1)
  • Jun Guo (1)
  • Zhen Yang (1)
  1. School of Information Engineering, Beijing University of Posts and Telecommunications, Beijing, China
