Perceptual Effects of the Degree of Articulation in HMM-Based Speech Synthesis

  • Benjamin Picart
  • Thomas Drugman
  • Thierry Dutoit
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7015)

Abstract

This paper investigates the factors that lead to high-quality HMM-based speech synthesis with various degrees of articulation. A neutral speech synthesizer is first adapted to generate hypo- and hyperarticulated speech. The impact of cepstral adaptation, of prosody, of phonetic transcription, and of the adaptation technique on the perceived degree of articulation is then studied through a subjective evaluation. It is shown that high-quality hypo- and hyperarticulated speech synthesis requires an efficient adaptation technique such as CMLLR. Moreover, in addition to prosody adaptation, the evaluation assesses the importance of cepstrum adaptation and of a Natural Language Processor able to generate realistic hypo- and hyperarticulated phonetic transcriptions.

Keywords

Speech Synthesis, Voice Conversion, Speech Synthesizer, Phonetic Transcription, Speaker Adaptation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Benjamin Picart (1)
  • Thomas Drugman (1)
  • Thierry Dutoit (1)
  1. TCTS Lab, Faculté Polytechnique (FPMs), University of Mons (UMons), Belgium
