Speaker adaptation with transformation matrix linear interpolation

Wuhan University Journal of Natural Sciences

Abstract

A transformation matrix linear interpolation (TMLI) approach for speaker adaptation is proposed. TMLI uses the transformation matrices produced by MLLR from selected training speakers and the testing speaker. With only 3 adaptation sentences, TMLI achieves a 12.12% reduction in word error rate. As the number of adaptation sentences increases, however, the performance saturates quickly. To improve the behavior of TMLI for large amounts of adaptation data, the TMLI+MAP method, which combines TMLI with the MAP technique, is proposed. Experimental results show that TMLI+MAP achieves better recognition accuracy than MAP and MLLR+MAP for both small and large amounts of adaptation data.
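The abstract does not give the interpolation formula, so the following is only a minimal sketch of the general idea, under stated assumptions: each speaker contributes an MLLR-style affine transform W = [b A] that maps an extended mean vector [1, mu] to an adapted mean, the interpolation weights are already known (how the paper estimates them is not stated here), and the MAP step uses the standard mean update with the interpolated mean as its prior. The function names, the weights, and the toy numbers are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def interpolate_transforms(transforms: Sequence[Matrix],
                           weights: Sequence[float]) -> Matrix:
    """Return the weighted sum of equally shaped transformation matrices."""
    if not transforms or len(transforms) != len(weights):
        raise ValueError("need one weight per transformation matrix")
    rows, cols = len(transforms[0]), len(transforms[0][0])
    combined = [[0.0] * cols for _ in range(rows)]
    for matrix, weight in zip(transforms, weights):
        for i in range(rows):
            for j in range(cols):
                combined[i][j] += weight * matrix[i][j]
    return combined


def adapt_mean(transform: Matrix, mean: Sequence[float]) -> List[float]:
    """Apply an affine transform W = [b A] to a Gaussian mean."""
    extended = [1.0] + list(mean)  # extended mean vector [1, mu_1, ..., mu_n]
    return [sum(w * x for w, x in zip(row, extended)) for row in transform]


def map_update(prior_mean: Sequence[float], data_sum: Sequence[float],
               occupancy: float, tau: float) -> List[float]:
    """Standard MAP mean update: (tau * prior + sum of data) / (tau + count).

    Assumption: the prior mean is the interpolated-transform mean, by
    analogy with MLLR+MAP; the source does not spell this out.
    """
    return [(tau * p + s) / (tau + occupancy)
            for p, s in zip(prior_mean, data_sum)]


# Hypothetical 2-dimensional example with two transforms.
w_train = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # identity, no bias
w_test = [[1.0, 1.0, 0.0], [-1.0, 0.0, 1.0]]   # identity plus a bias shift
w_tmli = interpolate_transforms([w_train, w_test], [0.5, 0.5])

adapted = adapt_mean(w_tmli, [2.0, 4.0])
smoothed = map_update(adapted, data_sum=[6.0, 6.0], occupancy=2.0, tau=2.0)
```

With little adaptation data the occupancy is small and the MAP result stays close to the interpolated mean. As data accumulates, the observed statistics dominate, which is the usual motivation for pairing a transform-based method with MAP.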

References

  1. Gauvain J L, Lee C H. Maximum a Posteriori Estimation for Multivariate Gaussian Mixture Observations of Markov Chains. IEEE Trans SAP, 1994, 2(2): 291–298.

  2. Leggetter C J, Woodland P C. Maximum Likelihood Linear Regression for Speaker Adaptation of Continuous Density Hidden Markov Models. Computer Speech and Language, 1995, 9(2): 171–185.

  3. Wang Z Y, Liu F. Speaker Adaptation Using Maximum Likelihood Model Interpolation. Proc ICASSP, 1999, 2: 1368–1372.

  4. Zhang J H, Tang X M. BAYES Methods. Changsha: National University of Defence Technology Publishing Company, 1993.

  5. Xu T Z. The Analysis of Applied Fonctionelle. Beijing: Science Press, 2002.

  6. Chang E, Zhou J L, Di S, et al. Large Vocabulary Mandarin Speech Recognition with Different Approaches in Modeling Tones. Proc ICSLP, 2000, 2: 983–986.

Additional information

Foundation item: Supported by the Science and Technology Committee of Shanghai (01JC14033)

Biography: XU Xiang-hua (1977-), female, Ph. D. candidate, research direction: large vocabulary continuous Mandarin speech recognition and speaker adaptation

About this article

Cite this article

Xiang-hua, X., Jie, Z. Speaker adaptation with transformation matrix linear interpolation. Wuhan Univ. J. Nat. Sci. 9, 927–930 (2004). https://doi.org/10.1007/BF02850801

  • DOI: https://doi.org/10.1007/BF02850801
