Abstract
The problem of timing in speech synthesis is usually construed as a search for algorithms that alter the durations of speech units under various conditions (e.g., stressed versus unstressed syllables, final versus non-final position, nature of surrounding segments). This chapter proposes a model of phonological representation and phonetic interpretation based on Firthian prosodic analysis [Fir57], which is instantiated in the YorkTalk speech generation system. In this model, timing is treated as part of phonetic interpretation and not as an integral part of phonological representation. This leads us to explore the possibility that speech rhythm is the product of relationships between abstract constituents of linguistic structure, among which there is no single optimal distinguished unit.
References
D. Abercrombie. Syllable quantity and enclitics in English. In Honour of Daniel Jones, D. Abercrombie, D. B. Fry, P. A. D. MacCarthy, N. C. Scott and J. L. Trim, eds. Longman Green, London, 216–222, 1964.
C. P. Browman and L. M. Goldstein. Towards an articulatory phonology. Phonology Yearbook 3:219–252, 1989.
W. N. Campbell and S. D. Isard. Segment durations in a syllable frame. J. Phonetics 19:37–47, 1991.
J. C. Carnochan. Gemination in Hausa. In Studies in Linguistic Analysis, Special Volume of the Philological Society, 2nd edition, 49–81, 1957.
N. Chomsky and M. Halle. The Sound Pattern of English. Harper & Row, New York, 1968.
C. H. Coker, N. Umeda, and C. P. Browman. Automatic synthesis from ordinary English text. IEEE Transactions on Audio and Electroacoustics, AU-21, 3:293–298, 1973.
J. C. Coleman. The phonetic interpretation of headed phonological structures containing overlapping constituents. Phonology Yearbook 9(1):1–44, 1992.
J. C. Coleman. Polysyllabic words in the YorkTalk synthesis system. In Papers in Laboratory Phonology III, P. Keating, ed. Cambridge University Press, 293–324, 1993.
J. R. Firth. A synopsis of linguistic theory. In Studies in Linguistic Analysis, Special Volume of the Philological Society, 2nd edition, 1–32, 1957.
C. A. Fowler. Coarticulation and theories of extrinsic timing. Journal of Phonetics 8:113–133, 1980.
C. A. Fowler. A relationship between coarticulation and compensatory shortening. Phonetica 38:35–50, 1981.
C. A. Fowler. Converging sources of evidence for spoken and perceived rhythms of speech: Cyclic production of vowels in sequences of monosyllabic stress feet. Journal of Experimental Psychology: General 112:386–412, 1983.
E. J. A. Henderson. Prosodies in Siamese. Asia Major 1:198–215, 1949.
E. J. A. Henderson. The phonology of loanwords in some South-East Asian languages. Transactions of the Philological Society 131–158, 1952.
J. Kelly. Swahili phonological structure: A prosodic view. In Le Swahili et ses Limites, M. F. Rombi, ed. Editions Recherche sur les Civilisations, Paris, 25–31, 1989.
J. Kelly. Systems for open syllabics in North Welsh. In Studies in Systemic Phonology, P. Tench, ed. Pinter Publishers, London and New York, 87–97, 1992.
M. Kenstowicz. Phonology in Generative Grammar. Basil Blackwell, Oxford, 1994.
D. H. Klatt. Klattalk: The conversion of English text to speech. Unpublished manuscript, Massachusetts Institute of Technology, Cambridge, MA.
D. H. Klatt. Review of text-to-speech conversion for English. Journal of the Acoustical Society of America 82(3):737–793, 1987.
B. Lindblom and K. Rapp. Some temporal regularities of spoken Swedish. Papers in Linguistics from the University of Stockholm 21:1–59, 1973.
J. K. Local. Some rhythm, resonance and quality variations in urban Tyneside speech. In Studies in the Pronunciation of English: A Commemorative Volume in Honour of A. C. Gimson, S. Ramsaran, ed. Routledge, London, 286–292, 1990.
J. K. Local. Modelling assimilation in a non-segmental rule-free phonology. In Papers in Laboratory Phonology II, G. J. Docherty and D. R. Ladd, eds. Cambridge University Press, Cambridge, 190–223, 1992.
J. K. Local and R. A. Ogden. Temporal exponents of word-structure in English. York Research Papers in Linguistics. YLLS/RP 1994.
S. Y. Manuel, S. Shattuck-Hufnagel, M. Huffman, K. N. Stevens, R. Carlson, and S. Hunnicutt. Studies of vowel and consonant reduction. In Proceedings of ICSLP 2:943–946, 1992.
R. A. Ogden. Parametric interpretation in YorkTalk. York Papers in Linguistics 16:81–99, 1992.
R. A. Ogden. European Patent Application 93307872.7 — YorkTalk. 1993.
B. H. Partee. Compositionality. In Varieties of Formal Semantics, F. Landman and F. Veltman, eds. Foris, Dordrecht, 281–312, 1984.
M. D. Riley. Tree-based modeling for speech synthesis. In Talking Machines: Theories, Models, and Designs, G. Bailly and C. Benoit, eds. Elsevier, North-Holland, Amsterdam, 265–273, 1992.
A. Simpson. The phonologies of the English auxiliary system. In Who Climbs the Grammar Tree? R. Tracy, ed. Niemeyer, Tuebingen, 209–219, 1992.
C. L. Smith. Prosodic patterns in the coordination of vowel and consonant gestures. Paper given at the Fourth Laboratory Phonology Meeting, Oxford, August, 1993.
R. K. Sprigg. Vowel harmony in Lhasa Tibetan: Prosodic analysis applied to interrelated vocalic features of successive syllables. Bulletin of the School of Oriental and African Studies 24:116–138, 1966.
J. P. H. van Santen. Deriving text-to-speech durations from natural speech. In Talking Machines: Theories, Models, and Designs, G. Bailly and C. Benoit, eds. Elsevier, North-Holland, Amsterdam, 275–285, 1992.
J. P. H. van Santen. Assignment of segmental duration in text-to-speech synthesis. Computer Speech & Language 8:95–128, 1994.
J. P. H. van Santen, J. Coleman, and M. Randolph. Effects of postvocalic voicing on the time course of vowels and diphthongs. Journal of the Acoustical Society of America 92:2444, 1992.
D. Wheeler. Aspects of a Categorial Theory of Phonology. Graduate Linguistics Student Association, University of Massachusetts at Amherst, 1981.
K. Wiik. On a third type of speech rhythm: Foot timing. In Proceedings of the Twelfth International Congress of Phonetic Sciences, Aix-en-Provence, 3:298–301, 1991.
Copyright information
© 1997 Springer Science+Business Media New York
About this chapter
Cite this chapter
Local, J., Ogden, R. (1997). A Model of Timing for Nonsegmental Phonological Structure. In: van Santen, J.P.H., Olive, J.P., Sproat, R.W., Hirschberg, J. (eds) Progress in Speech Synthesis. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-1894-4_9
DOI: https://doi.org/10.1007/978-1-4612-1894-4_9
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4612-7328-8
Online ISBN: 978-1-4612-1894-4
eBook Packages: Springer Book Archive