SVM Based Speaker Selection Using GMM Supervector for Rapid Speaker Adaptation

Wang, Jian; Lei, Jianjun; Guo, Jun; Yang, Zhen

doi:10.1007/11903697_78

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4247)

Included in the following conference series:

Asia-Pacific Conference on Simulated Evolution and Learning

Abstract

In this paper, we propose a novel method for rapid speaker adaptation called speaker support vector selection (SSVS). Using Gaussian mixture models (GMMs) as speaker models, the method selects the reference speakers that are acoustically close to the test speaker. Unlike other selection methods, which only compute the likelihood between models, we use support vector machines (SVMs) to obtain a ‘more optimal speaker subset’. The selection is determined dynamically according to the distribution of the reference speakers close to the test speaker. Furthermore, a single-pass re-estimation procedure conditioned on the selected speakers is presented. The adaptation strategy was evaluated on a large-vocabulary speech recognition task, where it yields a 13% relative improvement in accuracy over the baseline system.
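The selection idea can be illustrated with a minimal sketch. The abstract does not specify how the SVM is trained, which kernel is used, or how the supervector is built, so everything below is an assumption made for illustration: the supervector is taken to be the stacked GMM mean vectors, the test speaker is treated as the single positive example and the reference speakers as negatives, a linear SVM is trained with a simple dual coordinate descent, and the reference speakers that become support vectors are taken as the selected subset. The names `supervector` and `select_support_speakers` and the parameters `c`, `epochs`, and `eps` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence

Vector = List[float]


def supervector(means: Sequence[Sequence[float]]) -> Vector:
    """Stack the mean vectors of one speaker's GMM into a single long vector."""
    return [value for mean in means for value in mean]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def select_support_speakers(
    test_sv: Sequence[float],
    reference_svs: Sequence[Sequence[float]],
    c: float = 10.0,      # soft-margin penalty (assumed value)
    epochs: int = 200,    # passes over the data (assumed value)
    eps: float = 1e-6,    # threshold for "non-zero" dual weight
) -> List[int]:
    """Return indices of reference speakers that become support vectors.

    Assumption: test speaker = +1, every reference speaker = -1, linear kernel.
    """
    # Append a constant 1 so the bias is learned as an extra weight.
    xs = [list(test_sv) + [1.0]] + [list(r) + [1.0] for r in reference_svs]
    ys = [1.0] + [-1.0] * len(reference_svs)
    alphas = [0.0] * len(xs)
    w = [0.0] * len(xs[0])

    # Dual coordinate descent for the hinge-loss linear SVM.
    for _ in range(epochs):
        for i, (x, y) in enumerate(zip(xs, ys)):
            grad = y * dot(w, x) - 1.0
            new_alpha = min(max(alphas[i] - grad / dot(x, x), 0.0), c)
            delta = new_alpha - alphas[i]
            if delta != 0.0:
                w = [wj + delta * y * xj for wj, xj in zip(w, x)]
                alphas[i] = new_alpha

    # Reference speakers with non-zero dual weight are the support vectors.
    return [i - 1 for i in range(1, len(xs)) if alphas[i] > eps]


# Toy data: each speaker has a 2-component GMM with 1-D means.
references = [
    supervector([[2.0], [1.0]]),
    supervector([[2.0], [-1.0]]),
    supervector([[6.0], [0.0]]),
    supervector([[7.0], [3.0]]),
]
test_speaker = supervector([[0.0], [0.0]])
selected = select_support_speakers(test_speaker, references)
```

In this toy layout the two reference speakers nearest the test speaker end up as support vectors, while the two distant ones receive zero weight. The size of the selected subset is not fixed in advance; it follows from where the reference speakers lie relative to the test speaker, which is one way to read the abstract's "dynamically determined" selection.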

Author information

Authors and Affiliations

School of Information Engineering, Beijing University of Posts and Telecommunications, 100876, Beijing, China
Jian Wang, Jianjun Lei, Jun Guo & Zhen Yang

Editor information

Editors and Affiliations

Department of Industrial Engineering and Management, Cheng Shiu University, Kaohsiung County, Taiwan, ROC
Tzai-Der Wang
School of Computer Science and Information Technology, RMIT University, VIC 3001, Melbourne, Australia
Xiaodong Li
AI-ECON Research Center, Department of Economics, National Chengchi University, 11623, Taipei, Taiwan
Shu-Heng Chen
Department of Computer Science and Technology, University of Sci. & Tech. of China, 230026, Hefei, Anhui, P.R. China
Xufa Wang
School of Information Technology and Electrical Engineering, University of New South Wales, Australian Defence Force Academy, Canberra, Australia
Hussein Abbass
The University of Tokyo, Japan
Hitoshi Iba
Department of Computer, University of Science and Technology of China, 230027, Hefei, China
Guo-Liang Chen
University of Birmingham, Birmingham, UK
Xin Yao

About this paper

Cite this paper

Wang, J., Lei, J., Guo, J., Yang, Z. (2006). SVM Based Speaker Selection Using GMM Supervector for Rapid Speaker Adaptation. In: Wang, TD., et al. Simulated Evolution and Learning. SEAL 2006. Lecture Notes in Computer Science, vol 4247. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11903697_78

DOI: https://doi.org/10.1007/11903697_78
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-47331-2
Online ISBN: 978-3-540-47332-9
eBook Packages: Computer Science, Computer Science (R0)