Speaker adaptation with transformation matrix linear interpolation

Xiang-hua, Xu; Jie, Zhu

doi:10.1007/BF02850801

Published: November 2004

Volume 9, pages 927–930, (2004)

Wuhan University Journal of Natural Sciences

Xu Xiang-hua and Zhu Jie

Abstract

A transformation matrix linear interpolation (TMLI) approach for speaker adaptation is proposed. TMLI uses the transformation matrices produced by maximum likelihood linear regression (MLLR) for selected training speakers and for the testing speaker. With only 3 adaptation sentences, TMLI achieves a 12.12% reduction in word error rate. As the number of adaptation sentences increases, however, its performance saturates quickly. To improve the behavior of TMLI with large amounts of adaptation data, the TMLI+MAP method, which combines TMLI with the maximum a posteriori (MAP) technique, is proposed. Experimental results show that TMLI+MAP achieves better recognition accuracy than MAP and MLLR+MAP for both small and large amounts of adaptation data.
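The abstract does not say how the interpolation weights are estimated or how the training speakers are selected, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each MLLR transform is a d x (d + 1) matrix applied to the extended mean vector [1, mu], that the weights are simply supplied by hand, and that the MAP step uses the interpolated mean as its prior with a hard assignment of frames to one Gaussian. The function names, the prior weight `tau`, and all numbers are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: weights, shapes, and helper names are assumptions,
# not the procedure described in the paper.

def interpolate_transforms(transforms, weights):
    """Return sum_k weights[k] * transforms[k] for equally shaped matrices."""
    if len(transforms) != len(weights):
        raise ValueError("need one weight per transformation matrix")
    rows, cols = len(transforms[0]), len(transforms[0][0])
    return [
        [sum(w * W[i][j] for W, w in zip(transforms, weights)) for j in range(cols)]
        for i in range(rows)
    ]


def mllr_adapt_mean(W, mean):
    """Apply an MLLR mean transform: mu_hat = W @ [1, mu], W is d x (d + 1)."""
    extended = [1.0] + list(mean)
    return [sum(w_ij * x for w_ij, x in zip(row, extended)) for row in W]


def map_update_mean(prior_mean, frames, tau):
    """MAP mean update, hard assignment: (tau * prior + sum(frames)) / (tau + n)."""
    n = len(frames)
    return [
        (tau * p + sum(f[i] for f in frames)) / (tau + n)
        for i, p in enumerate(prior_mean)
    ]


# Two hypothetical 2 x 3 transforms (bias column first) and hand-picked weights.
W_a = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]    # identity, no bias
W_b = [[1.0, 1.0, 0.0], [-1.0, 0.0, 1.0]]   # identity plus a bias of (1, -1)
W_mix = interpolate_transforms([W_a, W_b], [0.5, 0.5])

# Adapt a speaker-independent mean with the interpolated transform.
adapted = mllr_adapt_mean(W_mix, [2.0, 4.0])

# Refine with MAP, using the adapted mean as the prior (an assumed combination).
refined = map_update_mean(adapted, [[3.0, 3.0], [3.0, 3.0]], tau=2.0)
print(W_mix, adapted, refined)
```

In this sketch the MAP estimate moves from the interpolated prior toward the sample mean of the adaptation frames as more frames arrive, which is one way to read the abstract's motivation for adding MAP when adaptation data is plentiful.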

Author information

Authors and Affiliations

Department of Electronic Engineering, Shanghai Jiaotong University, 200030, Shanghai, China
Xu Xiang-hua & Zhu Jie

Additional information

Foundation item: Supported by the Science and Technology Committee of Shanghai (01JC14033)

Biography: XU Xiang-hua (1977-), female, Ph.D. candidate, research direction: large vocabulary continuous Mandarin speech recognition and speaker adaptation

About this article

Cite this article

Xiang-hua, X., Jie, Z. Speaker adaptation with transformation matrix linear interpolation. Wuhan Univ. J. Nat. Sci. 9, 927–930 (2004). https://doi.org/10.1007/BF02850801

Received: 01 March 2004
Issue Date: November 2004
DOI: https://doi.org/10.1007/BF02850801

CLC number

TN 912.34
