Towards Automatic Tone Correction in Non-native Mandarin

Peabody, Mitchell; Seneff, Stephanie

doi:10.1007/11939993_62

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4274)

Included in the following conference series:

International Symposium on Chinese Spoken Language Processing

Abstract

Feedback is an important part of foreign language learning and of Computer Aided Language Learning (CALL) systems. For pronunciation tutoring, one way to give feedback is to provide examples of correct speech for the student to imitate. However, this may be frustrating if the student is unable to match the example speech completely. This research advances towards providing feedback in the student’s own voice. Using the case of an American learning Mandarin Chinese, it highlights the differences between native and non-native pronunciations of Mandarin tone and presents a method for correcting tone errors, which uses pitch transformation techniques to alter the student’s tone productions while maintaining other voice characteristics.

Author information

Authors and Affiliations

Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, 02139, USA
Mitchell Peabody & Stephanie Seneff

Editor information

Editors and Affiliations

Department of Computer Science, The University of Hong Kong, Hong Kong
Qiang Huo
Human Language Technology Department, Institute for Infocomm Research (I2R), 119613, Singapore
Bin Ma
School of Computer Engineering, Nanyang Technological University (NTU), 639798, Singapore
Eng-Siong Chng
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore
Haizhou Li

About this paper

Cite this paper

Peabody, M., Seneff, S. (2006). Towards Automatic Tone Correction in Non-native Mandarin. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_62

DOI: https://doi.org/10.1007/11939993_62
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science (R0)