A Metrical Model of Prosody for Multilingual TTS

International Journal of Speech Technology

Abstract

The model of prosody used in the Aculab TTS system is unusual in several respects. Firstly, it is based firmly on current metrical theories of prosody. Secondly, it is entirely knowledge-based: there are no stochastic components in the model. Thirdly, it makes use of a quasi-random element to avoid the predictability of conventional synthetic prosody. Fourthly, it is specifically designed for multilingual use: it currently handles several Germanic and Romance languages.


Cite this article

Monaghan, A. A Metrical Model of Prosody for Multilingual TTS. International Journal of Speech Technology 6, 73–81 (2003). https://doi.org/10.1023/A:1021056124145

  • DOI: https://doi.org/10.1023/A:1021056124145
