Dealing with Prosody in a Text-to-Speech System

International Journal of Speech Technology

Abstract

The task of assigning appropriate intonation to synthetic speech is one that requires knowledge of linguistic structure as well as computational possibilities. This paper surveys the basic challenges facing the designer of a text-to-speech system, and reviews some of the perspectives on these problems that have been developed in the linguistic literature.


Cite this article

Goldsmith, J. Dealing with Prosody in a Text-to-Speech System. International Journal of Speech Technology 3, 51–63 (1999). https://doi.org/10.1023/A:1009678810697

  • DOI: https://doi.org/10.1023/A:1009678810697
