NLP Architectures for TTS Synthesis

Part of the book series: Text, Speech and Language Technology (TLTB, volume 3)

Abstract

Text analysis and, better still, text understanding are some of the biggest challenges taken up by artificial intelligence. Over the last thirty-five years, computational linguistics specialists have proposed an impressive number of linguistic formalisms and inference methods to tackle these problems. Is it not surprising, under these conditions, that few of the speech synthesis systems commercialized up to now embody a somewhat complete parser to properly examine the sentences to pronounce? And how do we explain the desperate eagerness of specialists to obtain a reasonable synthetic speech quality with just some simple syntactic heuristics, while research teams in natural language processing (NLP) now focus on the much more complex task of analyzing semantics and pragmatics in the context of automatic understanding, translation, or production of natural language?

La grammaire est l’art de lever les ambiguïtés de la langue; mais il ne faut pas que le levier soit plus lourd que le fardeau. (Grammar is the art of removing the ambiguities of a language; but the lever should not be heavier than the burden.) A. Rivarol (1784), De l’Universalité de la Langue

Copyright information

© 1997 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Dutoit, T. (1997). NLP Architectures for TTS Synthesis. In: An Introduction to Text-to-Speech Synthesis. Text, Speech and Language Technology, vol 3. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-5730-8_3

  • DOI: https://doi.org/10.1007/978-94-011-5730-8_3

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-0369-1

  • Online ISBN: 978-94-011-5730-8

  • eBook Packages: Springer Book Archive
