Automatic Phonetization

Dutoit, Thierry

doi:10.1007/978-94-011-5730-8_5

Part of the book series: Text, Speech and Language Technology (TLTB, volume 3)

Abstract

Natural languages have rich vocabularies. Although most native speakers rarely use more than 5,000 words,¹ the Oxford Advanced Learner’s Dictionary of Current English (Hornby, 1974) and the French Petit Robert (Robert, 1972) each list more than 50,000 lexical entries. The latter, at 2,000 pages, is merely a simplified version of the more exhaustive Robert (5,600 pages). Similarly, the Oxford English Dictionary contains more than 500,000 entries. Yet only full forms are included. Taking plural, feminine, and conjugated forms into account raises the total number of common words to be transcribed to at least 1 million, not to mention proper nouns, whose pronunciation is not always clearly established.

Let me take this opportunity of answering a question that has often been asked me, how to pronounce “slithy toves.” The “i” in “slithy” is long, as in “writhe”; and “toves” is pronounced so as to rhyme with “groves.” Again, the first “o” in “borogoves” is pronounced like the “o” in “borrow.” I have heard people try to give it the sound of the “o” in “worry.” Such is Human Perversity. Lewis Carroll, The Hunting of the Snark

“Mine is a long and a sad tale!” said the Mouse, turning to Alice, and sighing. “It IS a long tail, certainly,” said Alice, looking down with wonder at the Mouse’s tail; “but why do you call it sad?” Lewis Carroll, Alice’s Adventures in Wonderland

References

AINSWORTH, W.A., and B. PELL, (1989), “Connectionist Architectures for a Text-to-Speech System”, Proceedings of Eurospeech 89, vol. 1, pp. 125–128.
ALLEN, J., (1976), “Synthesis of Speech from Unrestricted Text”, Proceedings of the IEEE, vol. 64, n°4, pp. 433–442.
ALLEN, J., S. HUNNICUTT, and D. KLATT, (1987), From Text To Speech, The MITalk System, Cambridge University Press, Cambridge.
BELRHALI, R., V. AUBERGE, and L.J. BOE, (1992), “From Lexicon to Rules: Towards a Descriptive Method of French Text-to-Phonetics Transcription”, Proceedings of the International Conference on Spoken Language Processing 92, Alberta, pp. 1183–1186.
BELHOULA, K., (1993), “Rule-based Grapheme-to-Phoneme Conversion of Names”, Proceedings of Eurospeech 93, Berlin, pp. 881–884.
BÖHM, A., (1992), Maschinelle Sprachausgabe Deutschen und Englischen Textes, Ph.D. dissertation, Ruhr-Universität Bochum.
BOURCIEZ, E., (1958), Précis de Phonétique Française, Klincksieck, Paris.
CHOMSKY, N., and M. HALLE, (1968), The Sound Pattern of English, Harper and Row, New York.
CHURCH, K. W., (1986), “Stress Assignment in Letter-to-Sound Rules for Speech Synthesis”, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 86, Tokyo, pp. 2423–2427.
COKER, C.H., (1985), “A Dictionary-Intensive Letter-to-Sound Program”, Journal of the Acoustical Society of America, suppl. 1, n°78, S7.
COKER, C.H., K.W. CHURCH, and M.Y. LIBERMAN, (1990), “Morphology and Rhyming: Two Powerful Alternatives to Letter-to-Sound Rules for Speech Synthesis”, Proceedings of the ESCA Workshop on Speech Synthesis, Autrans (France), pp. 83–86.
DAELEMANS, W., and A. VAN DEN BOSCH, (1993), “TabTalk: Reusability in data-oriented grapheme-to-phoneme conversion”, Proceedings of Eurospeech 93, Berlin, pp. 1459–1462.
DELATTRE, P., (1966), Studies in French and Comparative Phonetics, Mouton, The Hague.
DUTOIT, T., (1993), High Quality Text-To-Speech Synthesis of the French Language, Ph. D. dissertation, Faculté Polytechnique de Mons.
ESPRIT-SAM (1992), “Standard Computer-Compatible Transcription”, ESPRIT Project 2589, Stage Report Sen.93, SAM-UCL-037.
FISHER, W., V. ZUE, J. BERNSTEIN, and D. PALLETT, (1987), “An Acoustic-Phonetic Data Base”, Journal of the Acoustical Society of America, Suppl. 1, 81.
HORNBY, A.S., (ed), (1974), Oxford Advanced Learner’s Dictionary of Current English, 3rd edition, Oxford University Press, Oxford.
HUNNICUTT, S., (1980), “Grapheme-to-Phoneme Rules: a Review”, Speech Transmission Laboratory, QPSR 2-3, Royal Institute of Technology, Stockholm, Sweden, pp. 38–60.
IMBS, P., (1971), Etudes Statistiques sur le Vocabulaire Français. Dictionnaire des Fréquences. Centre de Recherches pour un Trésor de la Langue Française (CNRS), Marcel Didier, Nancy, Paris.
INALF, (1984), Dictionnaire des Fréquences: Table de Répartition des Homographes, CNRS — INALF (Institut National de la Langue Française), Nancy.
KARTTUNEN, L., K. KOSKENNIEMI, and R.M. KAPLAN, (1987), “A Compiler for Two-Level Phonological Rules”, Report n° CSLI-87-108, Center for the Study of Language and Information, Stanford University.
KLATT, D., and D. SHIPMAN, (1982), “Letter-to-Phoneme Rules: a Semiautomatic Discovery Procedure”, Journal of the Acoustical Society of America, suppl 1, n° 72, S.48.
KLATT, D. H., (1987), “Review of Text-to-Speech Conversion for English”, Journal of the Acoustical Society of America, 82, 3, pp. 737–793.
KOSKENNIEMI, K., (1983), Two Level Morphology: A General Computational Model for Word-Form Recognition and Production, Ph.D. dissertation, Department of General Linguistics, University of Helsinki.
LAWRENCE, S.G.C., and G. KAYE, (1986), “Alignment of Phonemes with their Corresponding Orthography”, Computer Speech and Language, 1, pp. 153–165.
LENNIG, M., and J.P. BRASSARD, (1984), “Machine Readable Phonetic Alphabet for English and French”, Speech Communication, 3, pp. 165–166.
LEON, P.R., (1966), Prononciation du Français Standard, Didier, Paris.
LEVINSON, S.E., J.P. OLIVE, and J.S. TSCHIRGI, (1993), “Speech Synthesis in Telecommunications”, IEEE Communications Magazine, pp. 46–53.
LIBERMAN, M.Y., (1979), “Phonemic Transcription, Stress, and Segment Durations for Spelled Proper Names”, Journal of the Acoustical Society of America, Suppl.1, 64, S. 163.
LIBERMAN, M.Y., and K.W. CHURCH, (1992), “Text Analysis and Word Pronunciation in Text-to-Speech Synthesis”, in Advances in Speech Signal Processing, S. Furui, M.M. Sondhi, eds., Dekker, New York, pp. 791–831.
LUCASSEN, J.M. and R.L. MERCER, (1984), “An Information Theoretic Approach to the Automatic Determination of Phoneme Base Forms”, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 84, 52.5, pp. 42–45.
LUK, R.W.P., and R.I. DAMPER, (1996), “Stochastic Phonographic Transduction for English”, Computer Speech and Language, n°10, pp. 133–153.
MATSUMOTO, T. and Y. YAMAGUCHI, (1990), “A Multi-Language Text-to-Speech System Using Neural Networks”, Proceedings of the ESCA Workshop on Speech Synthesis, Autrans, pp. 269–272.
MANNELL, R., and J.E. CLARK, (1987), “Text-to-Speech Rule and Dictionary Development”, Speech Communication, n°6, pp. 317–324.
MC CULLOCH, N., M. BEDWORTH, and J. BRIDLE, (1987), “NETSPEAK — a re-implementation of NETTALK”, Computer Speech and Language, 2, pp. 289–301.
NYROP, K., (1963), Manuel Phonétique du Français Parlé, Gyldendal, Paris.
RILEY, M.D., (1991), “A Statistical Model for Generating Pronunciation Networks”, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 91, pp. 737–740.
ROBERT, P., (1972), Le Petit Robert, Société du Nouveau Littré, Paris.
SCHMIDT, M., S. FITT, C. SCOTT, and M. JACK, (1993), “Phonetic Transcription Standards for European Names (ONOMASTICA)”, Proceedings of Eurospeech 93, Berlin, pp. 279–282. (see also URL address: http://guagua.echo.lu/langeng/en/lrel/onomas.html)
SEJNOWSKI, T., and C.R. ROSENBERG, (1987), “Parallel Networks that Learn to Pronounce English Text”, Complex Systems, n°1, pp. 145–168.
SHOUP, J. E., (1980), “Phonological Aspects of Speech Recognition”, in Trends in Speech Recognition, W.A. Lea, ed., Prentice-Hall, New York, pp. 125–138.
TORKKOLA, K., (1993), “An Efficient Way to Learn English Grapheme-to-Phoneme Rules Automatically”, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 93, pp. II.199–II.202.
VAN COILE, B., (1991), “Inductive Learning of Pronunciation Rules with the DEPES System”, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 91, Toronto, vol. 2, pp. 745–748.
VAN COILE, B., (1993), “On the Development of Pronunciation Rules for Text-to-Speech Synthesis”, Proceedings of Eurospeech 93, Berlin, vol. 2, pp. 1455–1458.
WILLIAMS, B., (1994), “Welsh Letter-to-Sound Rules: Rewrite Rules and Two-Level Rules Compared”, Computer Speech and Language, n°8, pp. 261–277.
WITHGOTT, M. M., and F.R. CHEN, (1993), Computational Models of American English, CSLI Lecture Notes, n°32, Center for the Study of Language and Information, Stanford University.
WITTEN, I.H., (1982), Principles of Computer Speech, Academic Press.
YVON, F., (1996), “Grapheme-to-Phoneme Conversion of Multiple Unbounded Overlapping Chunks”, CMP-LG, paper n° 9608006.

Author information

Authors and Affiliations

Faculté Polytechnique de Mons, Mons, Belgium
Thierry Dutoit

About this chapter

Cite this chapter

Dutoit, T. (1997). Automatic Phonetization. In: An Introduction to Text-to-Speech Synthesis. Text, Speech and Language Technology, vol 3. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-5730-8_5

DOI: https://doi.org/10.1007/978-94-011-5730-8_5
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-0369-1
Online ISBN: 978-94-011-5730-8
eBook Packages: Springer Book Archive