Audiovisual Tools for Phonetic and Articulatory Visualization in Computer-Aided Pronunciation Training

Kröger, Bernd J.; Birkholz, Peter; Hoffmann, Rüdiger; Meng, Helen

doi:10.1007/978-3-642-12397-9_29

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5967)

Abstract

This paper reviews interactive methods for improving phonetic competence, both in second-language learning and in speech therapy for people with hearing impairments or articulation disorders. As an example, we introduce our audiovisual feedback software “SpeechTrainer”, which aims to improve the pronunciation of Standard German by visually highlighting acoustics-related and articulation-related sound features. Results from the literature on training methods, together with results for our own software, indicate that audiovisual tools for phonetic and articulatory visualization are beneficial in computer-aided pronunciation training environments.

Author information

Authors and Affiliations

Department of Phoniatrics, Pedaudiology, and Communication Disorders, University Hospital Aachen and RWTH Aachen University, Aachen, Germany
Bernd J. Kröger & Peter Birkholz
Department of Acoustics and Speech Communication, Dresden University of Technology, Dresden, Germany
Rüdiger Hoffmann
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong (CUHK), Shatin, NT, Hong Kong SAR, China
Helen Meng

Editor information

Editors and Affiliations

Second University of Naples, and IIASS, Via Pellegrino, 84019, Vietri sul Mare, SA, Italy
Anna Esposito
Centre for Language and Communication Studies, Trinity College, The University of Dublin, Dublin 2, Ireland
Nick Campbell & Carl Vogel
Department of Computing Science & Mathematics, University of Stirling, FK9 4LA, Stirling, Scotland, UK
Amir Hussain
Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, P.O. Box 217, 7500 AE, Enschede, The Netherlands
Anton Nijholt

About this chapter

Cite this chapter

Kröger, B.J., Birkholz, P., Hoffmann, R., Meng, H. (2010). Audiovisual Tools for Phonetic and Articulatory Visualization in Computer-Aided Pronunciation Training. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds) Development of Multimodal Interfaces: Active Listening and Synchrony. Lecture Notes in Computer Science, vol 5967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12397-9_29

DOI: https://doi.org/10.1007/978-3-642-12397-9_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12396-2
Online ISBN: 978-3-642-12397-9
eBook Packages: Computer Science, Computer Science (R0)
