Audiovisual Tools for Phonetic and Articulatory Visualization in Computer-Aided Pronunciation Training

  • Chapter
Development of Multimodal Interfaces: Active Listening and Synchrony

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5967)

Abstract

This paper reviews interactive methods for improving phonetic competence in second language learning and in speech therapy for people with hearing impairments or articulation disorders. As an example, we introduce our audiovisual feedback software “SpeechTrainer”, which aims to improve the pronunciation of Standard German by visually highlighting acoustics-related and articulation-related sound features. Results from the literature on training methods, together with results from our own software, indicate that audiovisual tools for phonetic and articulatory visualization are beneficial in computer-aided pronunciation training environments.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kröger, B.J., Birkholz, P., Hoffmann, R., Meng, H. (2010). Audiovisual Tools for Phonetic and Articulatory Visualization in Computer-Aided Pronunciation Training. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds) Development of Multimodal Interfaces: Active Listening and Synchrony. Lecture Notes in Computer Science, vol 5967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12397-9_29

  • DOI: https://doi.org/10.1007/978-3-642-12397-9_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12396-2

  • Online ISBN: 978-3-642-12397-9

  • eBook Packages: Computer Science, Computer Science (R0)