Experiment with GMM-Based Artefact Localization in Czech Synthetic Speech

  • Jiří Přibil
  • Anna Přibilová
  • Jindřich Matoušek
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9302)

Abstract

The paper describes an experiment that uses a statistical approach based on Gaussian mixture models (GMM) to localize artefacts in synthetic speech produced by a Czech text-to-speech system employing the unit selection principle. In addition, the paper analyzes how the number of GMM mixtures and the frame shift used during spectral feature analysis influence the accuracy of the resulting artefact positions. The results of the experiments confirm that the chosen concept functions properly, and the presented artefact position localizer can be used as an alternative to the manual localization method that is applied as standard.
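
The abstract does not give implementation details, so the following is only a minimal sketch of how frame-wise GMM scoring could be turned into artefact positions, and of why the frame shift limits the time resolution of those positions. Everything in it is an assumption made for illustration: the two-model setup (one GMM for clean speech, one for artefacts), the diagonal covariances, the hand-picked model parameters, the toy two-dimensional features, and the hypothetical helper names `gmm_log_likelihood` and `localize_artefacts`. It is not the authors' localizer.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector x under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    # log-sum-exp over the mixture components for numerical stability
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def localize_artefacts(frames, clean_gmm, artefact_gmm, frame_shift_s, frame_len_s):
    """Return merged (start, end) intervals in seconds where the artefact model
    scores higher than the clean-speech model."""
    intervals = []
    for i, x in enumerate(frames):
        if gmm_log_likelihood(x, *artefact_gmm) > gmm_log_likelihood(x, *clean_gmm):
            start = i * frame_shift_s  # frame i begins i frame shifts into the signal
            end = start + frame_len_s
            if intervals and start <= intervals[-1][1]:
                # overlapping or adjacent flagged frames form one region
                intervals[-1] = (intervals[-1][0], end)
            else:
                intervals.append((start, end))
    return intervals


# Hypothetical models given as (weights, means, variances); not trained on real data.
clean_gmm = ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
artefact_gmm = ([1.0], [[5.0, 5.0]], [[1.0, 1.0]])

# Toy per-frame feature vectors; frames 2 and 3 lie close to the artefact model.
frames = [[0.1, 0.2], [0.9, 1.1], [5.2, 4.8], [4.9, 5.1], [0.5, 0.4]]

intervals = localize_artefacts(frames, clean_gmm, artefact_gmm,
                               frame_shift_s=0.010, frame_len_s=0.020)
print(intervals)
```

In this sketch a region boundary can only fall on a multiple of the frame shift, so a smaller shift gives a finer position grid at the cost of more frames to score. The number of mixtures corresponds to the length of the `weights` list in each model.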

Keywords

  • Quality of synthetic speech
  • Text-to-speech system
  • GMM classification
  • Statistical analysis

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Jiří Přibil (1, 2)
  • Anna Přibilová (3)
  • Jindřich Matoušek (1)
  1. Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia, Plzeň, Czech Republic
  2. SAS, Institute of Measurement Science, Bratislava, Slovakia
  3. Faculty of Electrical Engineering and Information Technology, Institute of Electronics and Photonics, Slovak University of Technology, Bratislava, Slovakia