A Model of Timing for Nonsegmental Phonological Structure

  • John Local
  • Richard Ogden
Chapter

Abstract

The problem of timing in speech synthesis is usually construed as the search for appropriate algorithms for altering the durations of speech units under various conditions (e.g., stressed versus unstressed syllables, final versus non-final position, nature of surrounding segments). This chapter proposes a model of phonological representation and phonetic interpretation based on Firthian prosodic analysis [Fir57], which is instantiated in the YorkTalk speech generation system. In this model, timing is treated as part of phonetic interpretation and not as an integral part of phonological representation. This leads us to explore the possibility that speech rhythm is the product of relationships between abstract constituents of linguistic structure, of which there is no single optimal distinguished unit.

References

  1. [Abe64]
    D. Abercrombie. Syllable quantity and enclitics in English. In Honour of Daniel Jones, D. Abercrombie, D. B. Fry, P. A. D. MacCarthy, N. C. Scott and J. L. Trim, eds. Longman Green, London, 216–222, 1964.
  2. [BG89]
    C. P. Browman and L. M. Goldstein. Towards an articulatory phonology. Phonology Yearbook 3:219–252, 1989.
  3. [CI91]
    W. N. Campbell and S. D. Isard. Segment durations in a syllable frame. J. Phonetics 19:37–47, 1991.
  4. [Car57]
    J. C. Carnochan. Gemination in Hausa. In Studies in Linguistic Analysis, Special Volume of the Philological Society, 2nd edition, 49–81, 1957.
  5. [CH68]
    N. Chomsky and M. Halle. The Sound Pattern of English. Harper & Row: New York, 1968.
  6. [CUB73]
    C. H. Coker, N. Umeda, and C. P. Browman. Automatic synthesis from ordinary English text. IEEE Transactions on Audio and Electroacoustics, AU-21, 3:293–298, 1973.
  7. [Col92]
    J. C. Coleman. The phonetic interpretation of headed phonological structures containing overlapping constituents. Phonology Yearbook 9(1):1–44, 1992.
  8. [Col93]
    J. C. Coleman. Polysyllabic words in the YorkTalk synthesis system. In Papers in Laboratory Phonology III, P. Keating, ed. Cambridge University Press, 293–324, 1993.
  9. [Fir57]
    J. R. Firth. A synopsis of Linguistic Theory. In Studies in Linguistic Analysis, Special Volume of the Philological Society, 2nd edition, 1–32, 1957.
  10. [Fow80]
    C. A. Fowler. Coarticulation and theories of extrinsic timing. Journal of Phonetics 8:113–133, 1980.
  11. [Fow81]
    C. A. Fowler. A relationship between coarticulation and compensatory shortening. Phonetica 38:35–50, 1981.
  12. [Fow83]
    C. A. Fowler. Converging sources of evidence for spoken and perceived rhythms of speech: Cyclic production of vowels in sequences of monosyllabic stress feet. Journal of Experimental Psychology: General 112:386–412, 1983.
  13. [Hen49]
    E. J. A. Henderson. Prosodies in Siamese. Asia Major 1:198–215, 1949.
  14. [Hen52]
    E. J. A. Henderson. The phonology of loanwords in some South-East Asian languages. Transactions of the Philological Society 131–158, 1952.
  15. [Kel89]
    J. Kelly. Swahili phonological structure: A prosodic view. In Le Swahili et ses Limites, M. F. Rombi, ed. Editions Recherche sur les Civilisations, Paris, 25–31, 1989.
  16. [Kel92]
    J. Kelly. Systems for open syllabics in North Welsh. In Studies in Systemic Phonology, P. Tench, ed. Pinter Publishers, London and New York, 87–97, 1992.
  17. [Ken94]
    M. Kenstowicz. Phonology in Generative Grammar. Basil Blackwell, Oxford, 1994.
  18. [Kla]
    D. H. Klatt. Klattalk: The conversion of English text to speech. Unpublished manuscript, Massachusetts Institute of Technology, Cambridge, MA.
  19. [Kla87]
    D. H. Klatt. Review of text-to-speech conversion for English. Journal of the Acoustical Society of America 82(3):737–793, 1987.
  20. [LR73]
    B. Lindblom and K. Rapp. Some temporal regularities of spoken Swedish. Papers in Linguistics from the University of Stockholm 21:1–59, 1973.
  21. [Loc90]
    J. K. Local. Some rhythm, resonance and quality variations in urban Tyneside speech. In Studies in the Pronunciation of English: A Commemorative Volume in Honour of A. C. Gimson, S. Ramsaran, ed. Routledge, London, 286–292, 1990.
  22. [Loc92]
    J. K. Local. Modelling assimilation in a non-segmental rule-free phonology. In Papers in Laboratory Phonology II, G. J. Docherty and D. R. Ladd, eds. CUP, Cambridge, 190–223, 1992.
  23. [LO94]
    J. K. Local and R. A. Ogden. Temporal exponents of word-structure in English. York Research Papers in Linguistics. YLLS/RP 1994.
  24. [Man92]
    S. Y. Manuel, S. Shattuck-Hufnagel, M. Huffman, K. N. Stevens, R. Carlson, and S. Hunnicutt. Studies of vowel and consonant reduction. In Proceedings of ICSLP 2:943–946, 1992.
  25. [Ogd92]
    R. A. Ogden. Parametric interpretation in YorkTalk. York Papers in Linguistics 16:81–99, 1992.
  26. [Ogd93]
    R. A. Ogden. European Patent Application 93307872.7, YorkTalk. 1993.
  27. [Par84]
    B. H. Partee. Compositionality. In Varieties of Formal Semantics, F. Landman and F. Veltman, eds. Foris, Dordrecht, 281–312, 1984.
  28. [Ril92]
    M. D. Riley. Tree-based modeling for speech synthesis. In Talking Machines: Theories, Models, and Designs, G. Bailly and C. Benoit, eds. Elsevier, North-Holland, Amsterdam, 265–273, 1992.
  29. [Sim92]
    A. Simpson. The phonologies of the English auxiliary system. In Who Climbs the Grammar Tree? R. Tracy, ed. Niemeyer, Tuebingen, 209–219, 1992.
  30. [Smi93]
    C. L. Smith. Prosodic patterns in the coordination of vowel and consonant gestures. Paper given at the Fourth Laboratory Phonology Meeting, Oxford, August, 1993.
  31. [Spr66]
    R. K. Sprigg. Vowel harmony in Lhasa Tibetan: Prosodic analysis applied to interrelated vocalic features of successive syllables. Bulletin of the School of Oriental and African Studies 24:116–138, 1966.
  32. [van92]
    J. P. H. van Santen. Deriving text-to-speech durations from natural speech. In Talking Machines: Theories, Models, and Designs, G. Bailly and C. Benoit, eds. Elsevier, North-Holland, Amsterdam, 275–285, 1992.
  33. [van94]
    J. P. H. van Santen. Assignment of segmental duration in text-to-speech synthesis. Computer Speech & Language 8:95–128, 1994.
  34. [VCR92]
    J. P. H. van Santen, J. Coleman, and M. Randolph. Effects of postvocalic voicing on the time course of vowels and diphthongs. Journal of the Acoustical Society of America 92:2444, 1992.
  35. [Whe81]
    D. Wheeler. Aspects of a Categorial Theory of Phonology. Graduate Linguistics Student Association, University of Massachusetts at Amherst, 1981.
  36. [Wii91]
    K. Wiik. On a third type of speech rhythm: Foot timing. In Proceedings of the Twelfth International Congress of Phonetic Sciences, Aix-en-Provence, 3:298–301, 1991.

Copyright information

© Springer Science+Business Media New York 1997
