Phonetic Time Maps

Defining Constraints for Multilinear Speech Processing

Chapter in: The Integration of Phonetic Knowledge in Speech Technology

Part of the book series: Text, Speech and Language Technology (TLTB, volume 25)

Abstract

This paper presents a constraint-based model for the interpretation of multilinear representations of speech utterances which can provide important fine-grained information for speech recognition applications. The model uses explicit structural constraints specifying time maps—overlap and precedence relations between features in both the phonological and the phonetic domains—in order to recognize well-formed syllable structures. In the phonological domain, these constraints together form a complete phonotactic description of the language, while in the phonetic domain, the constraints define the internal structure of phonological features based on phonetic realisations. The constraints are enhanced by a constraint relaxation procedure to cater for underspecified input and allow output representations to be extrapolated based on the phonetic and phonological information contained in the constraints and the rankings which have been assigned to them. This approach thus describes the integration of explicit phonetic knowledge into a computational linguistic model to improve robustness in speech recognition.
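
The overlap and precedence relations mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Event` and `Constraint` classes, the feature names, the numeric ranks, and the `relax_below` threshold are all hypothetical, and the relaxation rule shown (skip a low-ranked constraint when one of its features is missing from the input) is only one simple reading of "constraint relaxation for underspecified input".

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Event:
    """A feature holding over a time interval (hypothetical representation)."""
    feature: str
    start: float
    end: float


@dataclass(frozen=True)
class Constraint:
    """A temporal relation between two features, with an assumed rank in [0, 1]."""
    left: str
    relation: str  # "overlap" or "precedence"
    right: str
    rank: float


def overlaps(a: Event, b: Event) -> bool:
    # Two intervals overlap when each starts before the other ends.
    return a.start < b.end and b.start < a.end


def precedes(a: Event, b: Event) -> bool:
    # a precedes b when a ends no later than b starts.
    return a.end <= b.start


def check(constraint: Constraint, events: Iterable[Event]) -> Optional[bool]:
    """Return True/False when both features are present, None when underspecified."""
    by_feature = {e.feature: e for e in events}
    a = by_feature.get(constraint.left)
    b = by_feature.get(constraint.right)
    if a is None or b is None:
        return None
    test = overlaps if constraint.relation == "overlap" else precedes
    return test(a, b)


def accepts(constraints, events, relax_below: float = 0.0) -> bool:
    """Accept the input if every constraint holds.

    A constraint whose features are missing from the input is relaxed
    (skipped) when its rank is below relax_below. Violated constraints
    always reject. With the default of 0.0 nothing is relaxed.
    """
    events = list(events)
    for c in constraints:
        result = check(c, events)
        if result is True:
            continue
        if result is None and c.rank < relax_below:
            continue  # relaxed: low-ranked and underspecified
        return False
    return True


# Hypothetical input: a nasal onset followed by a vowel, with no place feature detected.
events = [
    Event("voiced", 0.00, 0.30),
    Event("nasal", 0.00, 0.12),
    Event("vocalic", 0.12, 0.30),
]
constraints = [
    Constraint("nasal", "overlap", "voiced", rank=0.9),
    Constraint("nasal", "precedence", "vocalic", rank=0.8),
    Constraint("labial", "overlap", "nasal", rank=0.3),  # "labial" is absent from the input
]

strict_result = accepts(constraints, events)
relaxed_result = accepts(constraints, events, relax_below=0.5)
```

In this sketch the strict check rejects the input because the `labial` feature was never detected, while relaxing constraints ranked below 0.5 lets the same input through. Extrapolating the missing feature from the relaxed constraint, as the abstract describes, is not modelled here.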

© 2005 Springer

About this chapter

Cite this chapter

Carson-Berndsen, J., Walsh, M. (2005). Phonetic Time Maps. In: Barry, W.J., van Dommelen, W.A. (eds) The Integration of Phonetic Knowledge in Speech Technology. Text, Speech and Language Technology, vol 25. Springer, Dordrecht. https://doi.org/10.1007/1-4020-2637-4_4