NLP Architectures for TTS Synthesis

Dutoit, Thierry

doi:10.1007/978-94-011-5730-8_3

Part of the book series: Text, Speech and Language Technology (TLTB, volume 3)

Abstract

Text analysis and, better still, text understanding are some of the biggest challenges taken up by artificial intelligence. Over the last thirty-five years, computational linguistics specialists have proposed an impressive number of linguistic formalisms and inference methods to tackle these problems. Is it not surprising, under these conditions, that few of the speech synthesis systems commercialized up to now embody a somewhat complete parser to properly examine the sentences to pronounce? And how do we explain the desperate eagerness of specialists to obtain a reasonable synthetic speech quality with just some simple syntactic heuristics, while research teams in natural language processing (NLP) now focus on the much more complex task of analyzing semantics and pragmatics in the context of automatic understanding, translation, or production of natural language?

La grammaire est l’art de lever les ambiguités de la langue; mais il ne faut pas que le levier soit plus lourd que le fardeau. (Grammar is the art of removing the ambiguities of a language; but the lever should not be heavier than the burden.) A. Rivarol (1784), De l’Universalité de la Langue

Author information

Authors and Affiliations

Faculté Polytechnique de Mons, Mons, Belgium
Thierry Dutoit

About this chapter

Cite this chapter

Dutoit, T. (1997). NLP Architectures for TTS Synthesis. In: An Introduction to Text-to-Speech Synthesis. Text, Speech and Language Technology, vol 3. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-5730-8_3

DOI: https://doi.org/10.1007/978-94-011-5730-8_3
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-0369-1
Online ISBN: 978-94-011-5730-8
eBook Packages: Springer Book Archive
