Towards Automatic Tone Correction in Non-native Mandarin

  • Mitchell Peabody
  • Stephanie Seneff
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4274)

Abstract

Feedback is an important part of foreign language learning and of Computer-Aided Language Learning (CALL) systems. For pronunciation tutoring, one way to give feedback is to provide examples of correct speech for the student to imitate. However, this may be frustrating if a student is unable to match the example speech completely. This research takes a step towards providing feedback in the student’s own voice. Using the case of an American learning Mandarin Chinese, the differences between native and non-native pronunciations of Mandarin tone are highlighted, and a method for correcting tone errors is presented. The method uses pitch transformation techniques to alter the student’s tone productions while maintaining other voice characteristics.
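The idea of altering a student's tone while keeping the rest of the voice can be illustrated with a minimal sketch. It is not the authors' method, which the abstract does not detail. It assumes the student's pitch (F0) contour for one syllable is already available as a list of values in Hz, and it uses the conventional Chao tone-letter shapes (55, 35, 214, 51) as hypothetical targets. The function names `target_contour` and `scaling_ratios` are invented for this illustration. The output is a per-frame pitch scaling ratio that a separate pitch modification step would have to apply to the waveform.

```python
import math

# Conventional Chao tone-letter shapes for the four Mandarin tones,
# on a scale from 1 (low) to 5 (high). Used here as illustrative targets only.
CHAO_TEMPLATES = {1: [5, 5], 2: [3, 5], 3: [2, 1, 4], 4: [5, 1]}


def interpolate(points, n):
    """Linearly resample a short list of points to n values."""
    if n == 1:
        return [float(points[0])]
    out = []
    for i in range(n):
        pos = i * (len(points) - 1) / (n - 1)
        lo = min(int(pos), len(points) - 2)
        frac = pos - lo
        out.append(points[lo] * (1 - frac) + points[lo + 1] * frac)
    return out


def target_contour(tone, n_frames, f0_low, f0_high):
    """Place a tone template inside the speaker's own pitch range.

    Levels 1..5 are mapped onto [f0_low, f0_high] on a log-frequency
    scale, so the target stays within the student's voice range.
    """
    levels = interpolate(CHAO_TEMPLATES[tone], n_frames)
    log_lo, log_hi = math.log(f0_low), math.log(f0_high)
    return [math.exp(log_lo + (lv - 1) / 4 * (log_hi - log_lo)) for lv in levels]


def scaling_ratios(student_f0, tone, f0_low, f0_high):
    """Per-frame factor by which the student's pitch would be scaled."""
    target = target_contour(tone, len(student_f0), f0_low, f0_high)
    return [t / s for t, s in zip(target, student_f0)]


# A student produces a flat 150 Hz contour where a falling tone 4 was intended.
ratios = scaling_ratios([150.0] * 5, tone=4, f0_low=100.0, f0_high=200.0)
```

In this example the ratios fall from above 1 to below 1 across the syllable, which would turn the flat contour into a falling one within the student's own 100 to 200 Hz range. Real speech would also need voicing detection, smoothing, and handling of contextual tone variation, none of which is attempted here.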

Keywords

Pitch Contour, Speech Technology, Tonal Language, Tone Contour, Lexical Tone
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Mitchell Peabody (1)
  • Stephanie Seneff (1)
  1. Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, USA
