Phonetic Time Maps

Carson-Berndsen, Julie; Walsh, Michael

doi:10.1007/1-4020-2637-4_4

Part of the book series: Text, Speech and Language Technology (TLTB, volume 25)

Abstract

This paper presents a constraint-based model for the interpretation of multilinear representations of speech utterances which can provide important fine-grained information for speech recognition applications. The model uses explicit structural constraints that specify time maps (overlap and precedence relations between features in both the phonological and the phonetic domains) in order to recognize well-formed syllable structures. In the phonological domain, these constraints together form a complete phonotactic description of the language; in the phonetic domain, they define the internal structure of phonological features on the basis of their phonetic realisations. A constraint relaxation procedure caters for underspecified input and allows output representations to be extrapolated from the phonetic and phonological information contained in the constraints and from the rankings assigned to them. The approach thus integrates explicit phonetic knowledge into a computational linguistic model to improve robustness in speech recognition.
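
The overlap and precedence relations described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Event`, `Constraint` and `evaluate`, the numeric `rank` scale, the `relax_below` threshold, and the example features and timings are all assumptions made for illustration. The sketch treats each feature as a time interval, checks ranked overlap and precedence constraints against the input, and relaxes a low-ranked constraint when one of its features is missing from underspecified input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A feature that holds over a time interval (times in ms)."""
    feature: str
    start: float
    end: float


def overlaps(a: Event, b: Event) -> bool:
    """True if the two intervals share some stretch of time."""
    return a.start < b.end and b.start < a.end


def precedes(a: Event, b: Event) -> bool:
    """True if a ends no later than b begins."""
    return a.end <= b.start


@dataclass(frozen=True)
class Constraint:
    """A ranked temporal relation between two features (hypothetical format)."""
    first: str
    relation: str  # "overlap" or "precedence"
    second: str
    rank: float    # assumed scale in [0, 1]; higher means more important


def evaluate(constraints, events, relax_below=0.5):
    """Return (accepted, relaxed_constraints) for one candidate structure.

    Assumption: a constraint whose feature is absent from the input is
    relaxed only if its rank is below `relax_below`; otherwise the
    candidate is rejected. A violated relation always rejects.
    """
    by_feature = {e.feature: e for e in events}
    relaxed = []
    for c in constraints:
        a, b = by_feature.get(c.first), by_feature.get(c.second)
        if a is None or b is None:  # underspecified input
            if c.rank < relax_below:
                relaxed.append(c)
                continue
            return False, relaxed
        holds = overlaps(a, b) if c.relation == "overlap" else precedes(a, b)
        if not holds:
            return False, relaxed
    return True, relaxed


# Illustrative constraints for a plosive-vowel sequence (invented values).
constraints = [
    Constraint("voiceless", "overlap", "plosive", rank=0.9),
    Constraint("plosive", "precedence", "vowel", rank=0.9),
    Constraint("voiced", "overlap", "vowel", rank=0.3),
]
full_input = [
    Event("voiceless", 0, 80),
    Event("plosive", 0, 60),
    Event("voiced", 80, 200),
    Event("vowel", 80, 200),
]
# The same input with the "voiced" feature missing.
partial_input = [e for e in full_input if e.feature != "voiced"]

full_result = evaluate(constraints, full_input)
partial_result = evaluate(constraints, partial_input)
```

With these invented values, the fully specified input satisfies every constraint, while the input lacking `voiced` is still accepted because the only affected constraint is low-ranked and is therefore relaxed. Leaving out a feature required by a high-ranked constraint would cause rejection instead.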

Author information

Authors and Affiliations

University College Dublin, Dublin
Julie Carson-Berndsen & Michael Walsh

Editor information

Editors and Affiliations

Universität des Saarlandes, Saarbrücken, Germany
William J. Barry
Norwegian University of Science and Technology, Trondheim, Norway
Wim A. van Dommelen

About this chapter

Cite this chapter

Carson-Berndsen, J., Walsh, M. (2005). Phonetic Time Maps. In: Barry, W.J., van Dommelen, W.A. (eds) The Integration of Phonetic Knowledge in Speech Technology. Text, Speech and Language Technology, vol 25. Springer, Dordrecht. https://doi.org/10.1007/1-4020-2637-4_4

DOI: https://doi.org/10.1007/1-4020-2637-4_4
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-2635-5
Online ISBN: 978-1-4020-2637-9
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)
