Abstract
Natural languages have rich vocabularies. Although most native speakers rarely use more than 5,000 words, the Oxford Advanced Learner’s Dictionary of Current English (Hornby, 1974) and the French Petit Robert (Robert, 1972) each list more than 50,000 lexical entries. The latter, with its 2,000 pages, is merely a simplified version of the more exhaustive Robert (5,600 pages). Similarly, the Oxford English Dictionary contains more than 500,000 entries. Yet only full forms are included. Taking plural, feminine, and conjugated forms into account raises the total number of common words to be transcribed to at least 1 million, not to mention proper nouns, whose pronunciation is not always clearly established.
Let me take this opportunity of answering a question that has often been asked me, how to pronounce “slithy toves.” The “i” in “slithy” is long, as in “writhe”; and “toves” is pronounced so as to rhyme with “groves.” Again, the first “o” in “borogoves” is pronounced like the “o” in “borrow.” I have heard people try to give it the sound of the “o” in “worry”. Such is Human Perversity. Lewis Carroll, The Hunting of the Snark
“Mine is a long and a sad tale!” said the Mouse, turning to Alice, and sighing. “It IS a long tail, certainly,” said Alice, looking down with wonder at the Mouse’s tail; “but why do you call it sad?” Lewis Carroll, Alice’s Adventures in Wonderland
References
AINSWORTH, W.A., and B. PELL, (1989), “Connectionist Architectures for a Text-to-Speech System”, Proceedings of Eurospeech 89, vol. 1, pp. 125–128.
ALLEN, J., (1976), “Synthesis of Speech from Unrestricted Text”, Proceedings of the IEEE, vol. 64, n°4, pp. 433–442.
ALLEN, J., S. HUNNICUTT, and D. KLATT, (1987), From Text To Speech, The MITalk System, Cambridge University Press, Cambridge.
BELRHALI, R., V. AUBERGE, and L.J. BOE, (1992), “From Lexicon to Rules: Towards a Descriptive Method of French Text-to-Phonetics Transcription”, Proceedings of the International Conference on Spoken Language Processing 92, Alberta, pp. 1183–1186.
BELHOULA, K., (1993), “Rule-based Grapheme-to-Phoneme Conversion of Names”, Proceedings of Eurospeech 93, Berlin, pp. 881–884.
BÖHM, A., (1992), Maschinelle Sprachausgabe deutschen und englischen Textes, Ph.D. dissertation, Ruhr-Universität Bochum.
BOURCIEZ, E., (1958), Précis de Phonétique Française, Klincksieck, Paris.
CHOMSKY, N., and M. HALLE, (1968), The Sound Pattern of English, Harper and Row, New York.
CHURCH, K. W., (1986), “Stress Assignment in Letter-to-Sound Rules for Speech Synthesis”, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 86, Tokyo, pp. 2423–2427.
COKER, C.H., (1985), “A Dictionary-Intensive Letter-to-Sound Program”, Journal of the Acoustical Society of America, suppl. 1, n°78, S7.
COKER, C.H., K.W. CHURCH, and M.Y. LIBERMAN, (1990), “Morphology and Rhyming: Two Powerful Alternatives to Letter-to-Sound Rules for Speech Synthesis”, Proceedings of the ESCA Workshop on Speech Synthesis, Autrans (France), pp. 83–86.
DAELEMANS, W., and A. VAN DEN BOSCH, (1993), “TabTalk: Reusability in data-oriented grapheme-to-phoneme conversion”, Proceedings of Eurospeech 93, Berlin, pp. 1459–1462.
DELATTRE, P., (1966), Studies in French and Comparative Phonetics, Mouton, The Hague.
DUTOIT, T., (1993), High Quality Text-To-Speech Synthesis of the French Language, Ph. D. dissertation, Faculté Polytechnique de Mons.
ESPRIT-SAM (1992), “Standard Computer-Compatible Transcription”, ESPRIT Project 2589, Stage Report Sen.93, SAM-UCL-037.
FISHER, W., V. ZUE, J. BERNSTEIN, and D. PALLETT, (1987), “An Acoustic-Phonetic Data Base”, Journal of the Acoustical Society of America, Suppl. 1, 81.
HORNBY, A.S., (ed), (1974), Oxford Advanced Learner’s Dictionary of Current English, 3rd edition, Oxford University Press, Oxford.
HUNNICUTT, S., (1980), “Grapheme-to-Phoneme Rules: a Review”, Speech Transmission Laboratory, QPSR 2-3, Royal Institute of Technology, Stockholm, Sweden, pp. 38–60.
IMBS, P., (1971), Etudes Statistiques sur le Vocabulaire Français. Dictionnaire des Fréquences. Centre de Recherches pour un Trésor de la Langue Française (CNRS), Marcel Didier, Nancy, Paris.
INALF, (1984), Dictionnaire des Fréquences: Table de Répartition des Homographes, CNRS — INALF (Institut National de la Langue Française), Nancy.
KARTTUNEN, L., K. KOSKENNIEMI, and R.M. KAPLAN, (1987), “A Compiler for Two-Level Phonological Rules”, Report n° CSLI-87-108, Center for the Study of Language and Information, Stanford University.
KLATT, D., and D. SHIPMAN, (1982), “Letter-to-Phoneme Rules: a Semiautomatic Discovery Procedure”, Journal of the Acoustical Society of America, suppl 1, n° 72, S.48.
KLATT, D. H., (1987), “Review of Text-to-Speech Conversion for English”, Journal of the Acoustical Society of America, 82, 3, pp. 737–793.
KOSKENNIEMI, K., (1983), Two Level Morphology: A General Computational Model for Word-Form Recognition and Production, Ph.D. dissertation, Department of General Linguistics, University of Helsinki.
LAWRENCE, S.G.C., and G. KAYE, (1986), “Alignment of Phonemes with their Corresponding Orthography”, Computer Speech and Language, 1, pp. 153–165.
LENNIG, M., and J.P. BRASSARD, (1984), “Machine Readable Phonetic Alphabet for English and French”, Speech Communication, 3, pp. 165–166.
LEON, P.R., (1966), Prononciation du Français Standard, Didier, Paris.
LEVINSON, S.E., J.P. OLIVE, and J.S. TSCHIRGI, (1993), “Speech Synthesis in Telecommunications”, IEEE Communications Magazine, pp. 46–53.
LIBERMAN, M.Y., (1979), “Phonemic Transcription, Stress, and Segment Durations for Spelled Proper Names”, Journal of the Acoustical Society of America, Suppl. 1, 64, S. 163.
LIBERMAN, M.Y., and K.W. CHURCH, (1992), “Text Analysis and Word Pronunciation in Text-to-Speech Synthesis”, in Advances in Speech Signal Processing, S. Furui, M.M. Sondhi, eds., Dekker, New York, pp. 791–831.
LUCASSEN, J.M. and R.L. MERCER, (1984), “An Information Theoretic Approach to the Automatic Determination of Phoneme Base Forms”, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 84, 52.5, pp. 42–45.
LUK, R.W.P., and R.I. DAMPER, (1996), “Stochastic Phonographic Transduction for English”, Computer Speech and Language, n°10, pp. 133–153.
MATSUMOTO, T., and Y. YAMAGUCHI, (1990), “A Multi-Language Text-to-Speech System Using Neural Networks”, Proceedings of the ESCA Workshop on Speech Synthesis, Autrans, pp. 269–272.
MANNEL, R., and J.E. CLARK, (1987), “Text-to-Speech Rule and Dictionary Development”, Speech Communication, n°6, pp. 317–324.
MC CULLOCH, N., M. BEDWORTH, and J. BRIDLE, (1987), “NETSPEAK — a re-implementation of NETTALK”, Computer Speech and Language, 2, pp. 289–301.
NYROP, K., (1963), Manuel Phonétique du Français Parlé, Gyldendal, Paris.
RILEY, M.D., (1991), “A Statistical Model for Generating Pronunciation Networks”, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 91, pp. 737–740.
ROBERT, P., (1972), Le Petit Robert, Société du Nouveau Littré, Paris.
SCHMIDT, M., S. FITT, C. SCOTT, and M. JACK, (1993), “Phonetic Transcription Standards for European Names (ONOMASTICA)”, Proceedings of Eurospeech 93, Berlin, pp. 279–282. (see also URL address: http://guagua.echo.lu/langeng/en/lrel/onomas.html)
SEJNOWSKI, T., and C.R. ROSENBERG, (1987), “Parallel Networks that Learn to Pronounce English Text”, Complex Systems, n°1, pp. 145–168.
SHOUP, J.E., (1980), “Phonological Aspects of Speech Recognition”, in Trends in Speech Recognition, W.A. Lea, ed., Prentice-Hall, New York, pp. 125–138.
TORKKOLA, K., (1993), “An Efficient Way to Learn English Grapheme-to-Phoneme Rules Automatically”, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 93, pp. II.199–II.202.
VAN COILE, B., (1991), “Inductive Learning of Pronunciation Rules with the DEPES System”, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 91, Toronto, vol. 2, pp. 745–748.
VAN COILE, B., (1993), “On the Development of Pronunciation Rules for Text-to-Speech Synthesis”, Proceedings of Eurospeech 93, Berlin, vol. 2, pp. 1455–1458.
WILLIAMS, B., (1994), “Welsh Letter-to-Sound Rules: Rewrite Rules and Two-Level Rules Compared”, Computer Speech and Language, n°8, pp. 261–277.
WITHGOTT, M.M., and F.R. CHEN, (1993), Computational Models of American English, CSLI Lecture Notes, n°32, Center for the Study of Language and Information, Stanford University.
WITTEN, I.H., (1982), Principles of Computer Speech, Academic Press.
YVON, F., (1996), “Grapheme-to-Phoneme Conversion of Multiple Unbounded Overlapping Chunks”, CMP-LG, paper n° 960800621
Copyright information
© 1997 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Dutoit, T. (1997). Automatic Phonetization. In: An Introduction to Text-to-Speech Synthesis. Text, Speech and Language Technology, vol 3. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-5730-8_5
DOI: https://doi.org/10.1007/978-94-011-5730-8_5
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-0369-1
Online ISBN: 978-94-011-5730-8
eBook Packages: Springer Book Archive