Abstract
This paper describes an experiment in using Gaussian mixture models (GMMs) to evaluate the quality of speech produced by different synthesis and parameterization methods. It also analyzes and compares the influence of different feature types and different numbers of mixtures on the GMM evaluation. Finally, the GMM evaluation scores are compared with the results of conventional listening tests based on mean opinion score (MOS) ratings. The results obtained by the two approaches are in agreement.
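The abstract does not spell out the scoring procedure, so the following is only a minimal sketch of one common way a GMM classifier can be used for this kind of evaluation: fit one GMM per synthesis method on feature vectors and score held-out frames by average log-likelihood. Everything here is an assumption made for illustration: the diagonal covariances, the plain EM training, the synthetic two-dimensional "features", and the names `fit_diag_gmm`, `avg_log_likelihood`, `make_frames`, `method_A`, and `method_B`. It is not the authors' implementation.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def fit_diag_gmm(data, n_mix, n_iter=30, var_floor=1e-3, seed=0):
    """Fit a diagonal-covariance GMM with plain EM (illustrative, not optimized)."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Initialise means from random frames and variances from the global variance.
    means = [list(x) for x in rng.sample(data, n_mix)]
    g_mean = [sum(x[d] for x in data) / n for d in range(dim)]
    g_var = [
        max(sum((x[d] - g_mean[d]) ** 2 for x in data) / n, var_floor)
        for d in range(dim)
    ]
    variances = [list(g_var) for _ in range(n_mix)]
    weights = [1.0 / n_mix] * n_mix
    for _ in range(n_iter):
        # E-step: posterior probability of each mixture for each frame.
        resp = []
        for x in data:
            logs = [
                math.log(w) + log_gauss_diag(x, m, v)
                for w, m, v in zip(weights, means, variances)
            ]
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means, and floored variances.
        for k in range(n_mix):
            nk = sum(r[k] for r in resp) + 1e-12  # guard against empty mixtures
            weights[k] = nk / n
            means[k] = [
                sum(r[k] * x[d] for r, x in zip(resp, data)) / nk
                for d in range(dim)
            ]
            variances[k] = [
                max(
                    sum(r[k] * (x[d] - means[k][d]) ** 2 for r, x in zip(resp, data)) / nk,
                    var_floor,
                )
                for d in range(dim)
            ]
    return weights, means, variances


def avg_log_likelihood(model, data):
    """Average per-frame log-likelihood of data under a fitted GMM."""
    weights, means, variances = model
    total = 0.0
    for x in data:
        total += logsumexp(
            [
                math.log(w) + log_gauss_diag(x, m, v)
                for w, m, v in zip(weights, means, variances)
            ]
        )
    return total / len(data)


def make_frames(center, n, seed):
    """Synthetic 2-D stand-in for real speech features (hypothetical data)."""
    rng = random.Random(seed)
    return [[rng.gauss(center, 1.0), rng.gauss(center, 1.0)] for _ in range(n)]


# One model per (hypothetical) synthesis method; the number of mixtures is a free parameter.
models = {
    "method_A": fit_diag_gmm(make_frames(0.0, 200, seed=1), n_mix=2),
    "method_B": fit_diag_gmm(make_frames(3.0, 200, seed=2), n_mix=2),
}
held_out = make_frames(0.0, 50, seed=3)  # unseen frames drawn like method_A
scores = {name: avg_log_likelihood(m, held_out) for name, m in models.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Under these assumptions, the per-model scores play the role of the GMM evaluation scores mentioned in the abstract; repeating the run with different `n_mix` values or different feature sets would be the analogue of the comparisons the paper reports.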
The work has been supported by the Technology Agency of the Czech Republic, project No. TA01030476, the Grant Agency of the Slovak Academy of Sciences (VEGA 2/0090/11), and the Ministry of Education of the Slovak Republic (VEGA 1/0987/12).
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Přibil, J., Přibilová, A., Matoušek, J. (2013). Experiment with Evaluation of Quality of the Synthetic Speech by the GMM Classifier. In: Habernal, I., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2013. Lecture Notes in Computer Science, vol 8082. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40585-3_31
DOI: https://doi.org/10.1007/978-3-642-40585-3_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40584-6
Online ISBN: 978-3-642-40585-3
eBook Packages: Computer Science (R0)