Speaker identification using multi-step clustering algorithm with transformation-based GMM

Xu, Limin; Tang, Zhenmin

doi:10.3103/S0146411607040062

Published: August 2007

Volume 41, pages 224–231, (2007)
Automatic Control and Computer Sciences

Abstract

To improve the performance of speaker recognition, an embedded linear transformation is used to integrate the feature transformation and the diagonal-covariance Gaussian mixture into a unified framework. In this case, the mixture number of the GMM must be fixed during model training. The cluster expectation-maximization (EM) algorithm is a well-known technique in which the mixture number is treated as a parameter to be estimated. This paper presents a new model structure that integrates a multi-step cluster algorithm into the estimation process of the GMM with the embedded transformation. In this approach, the transformation matrix, the mixture number, and the model parameters are estimated simultaneously according to a maximum likelihood criterion. The proposed method is demonstrated on a database of three data sessions for text-independent speaker identification. The experiments show that this method outperforms the traditional GMM with the cluster EM algorithm.
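
As a rough illustration of the kind of model the abstract describes, the sketch below scores feature frames under a diagonal-covariance GMM applied to linearly transformed features y = A x, and adds a description-length style penalty so that models with different mixture numbers can be compared. It is only a sketch under stated assumptions: the function names, the argument layout, and the exact penalty term are hypothetical and are not taken from the paper, and the multi-step clustering procedure itself is not reproduced here.

```python
import math


def log_abs_det(matrix):
    """Return log|det(matrix)| using Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    total = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("transformation matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        total += math.log(abs(a[col][col]))
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return total


def frame_log_likelihood(x, A, weights, means, variances):
    """log p(x) when y = A x follows a diagonal-covariance GMM.

    By the change-of-variables rule, p(x) = |det A| * sum_i w_i N(A x; mu_i, var_i).
    """
    y = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    comp = []
    for w, mu, var in zip(weights, means, variances):
        log_gauss = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (y_d - m) ** 2 / v)
            for y_d, m, v in zip(y, mu, var)
        )
        comp.append(math.log(w) + log_gauss)
    top = max(comp)  # log-sum-exp for numerical stability
    return log_abs_det(A) + top + math.log(sum(math.exp(c - top) for c in comp))


def penalized_score(frames, A, weights, means, variances):
    """Negative log-likelihood plus an MDL-style penalty (assumed form).

    Lower is better. Only the GMM parameters are counted; the entries of A
    are shared by all candidate mixture numbers and are left out here.
    """
    n, dim, k = len(frames), len(frames[0]), len(weights)
    free_params = k * (1 + 2 * dim) - 1  # weights, means, diagonal variances
    nll = -sum(frame_log_likelihood(x, A, weights, means, variances) for x in frames)
    return nll + 0.5 * free_params * math.log(n * dim)
```

In a speaker identification setting, one such model would be trained per speaker, and a test utterance would be assigned to the speaker whose model gives the highest total frame log-likelihood. Comparing `penalized_score` across candidate mixture numbers is one simple way to treat the mixture number as an estimated quantity rather than a fixed one.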

Author information

Authors and Affiliations

Department of Electronic Commerce, School of International Economics and Business, Nanjing University of Finance and Economics, 210094, Nanjing, Jiangsu, China
Limin Xu
School of Computer Science, Nanjing University of Science and Technology, 210094, Nanjing, Jiangsu, China
Zhenmin Tang

Corresponding author

Correspondence to Limin Xu.

Additional information

This text was submitted by the authors in English.

About this article

Cite this article

Xu, L., Tang, Z. Speaker identification using multi-step clustering algorithm with transformation-based GMM. Aut. Control Comp. Sci. 41, 224–231 (2007). https://doi.org/10.3103/S0146411607040062

Received: 18 April 2007
Accepted: 20 October 2006
Issue Date: August 2007
DOI: https://doi.org/10.3103/S0146411607040062
