Abstract
The teaching speech that L2 learners choose to imitate plays a key role in learning Mandarin tones. It has been found that synthesized teaching speech becomes more acceptable when it resembles the learner’s own speech. Voice modification technology can synthesize such teaching speech from both standard Chinese speech and the learner’s speech. The choice of standard Chinese speaker, however, affects the quality of the synthesized speech. This paper studies how to select the standard Chinese speech used in teaching speech synthesis. Speaker features including MFCC, pitch, and rhythm are compared, and a Gaussian Mixture Model is used to select the most appropriate Chinese speaker. Perceptual experiments show that modification with the Chinese speech most similar to the learner’s speech in MFCC yields the best teaching speech in both phonetic and tonal quality.
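The abstract does not spell out how the Gaussian Mixture Model is applied, so the following is only a minimal sketch of one plausible reading: fit a small diagonal-covariance GMM to each candidate standard speaker's feature frames, then pick the speaker whose model assigns the highest average log-likelihood to the learner's frames. The function names (`fit_diag_gmm`, `select_standard_speaker`), the two-component setting, and the synthetic 2-D vectors standing in for MFCC frames are all assumptions made for illustration, not details taken from the paper.

```python
import math
import random


def _log_component(x, weight, mean, var):
    # Log of weight * N(x | mean, diag(var)).
    log_p = math.log(weight)
    for xi, mi, vi in zip(x, mean, var):
        log_p -= 0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return log_p


def _log_sum_exp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def fit_diag_gmm(frames, n_components=2, n_iter=20, var_floor=1e-3):
    """Fit a diagonal-covariance GMM with plain EM (hypothetical helper)."""
    n, dim = len(frames), len(frames[0])
    g_mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    g_var = [max(sum((f[d] - g_mean[d]) ** 2 for f in frames) / n, var_floor)
             for d in range(dim)]
    weights = [1.0 / n_components] * n_components
    # Deterministic initialisation: means taken from evenly spaced frames.
    means = [list(frames[(k * n) // n_components]) for k in range(n_components)]
    variances = [list(g_var) for _ in range(n_components)]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame.
        resp = []
        for x in frames:
            logs = [_log_component(x, weights[k], means[k], variances[k])
                    for k in range(n_components)]
            norm = _log_sum_exp(logs)
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: re-estimate weights, means, and floored variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            if nk < 1e-8:
                continue  # leave a collapsed component unchanged
            weights[k] = nk / n
            means[k] = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                        for d in range(dim)]
            variances[k] = [
                max(sum(r[k] * (x[d] - means[k][d]) ** 2
                        for r, x in zip(resp, frames)) / nk, var_floor)
                for d in range(dim)]
    return weights, means, variances


def average_log_likelihood(model, frames):
    weights, means, variances = model
    total = 0.0
    for x in frames:
        total += _log_sum_exp([_log_component(x, w, m, v)
                               for w, m, v in zip(weights, means, variances)])
    return total / len(frames)


def select_standard_speaker(candidates, learner_frames):
    """Return the candidate whose GMM best explains the learner's frames."""
    scores = {name: average_log_likelihood(fit_diag_gmm(frames), learner_frames)
              for name, frames in candidates.items()}
    return max(scores, key=scores.get), scores


def toy_frames(rng, center, n=200):
    # Synthetic stand-in for MFCC frames; real features would come from audio.
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]


rng = random.Random(0)
candidates = {
    "speaker_A": toy_frames(rng, (0.0, 0.0)),
    "speaker_B": toy_frames(rng, (3.0, 3.0)),
}
learner = toy_frames(rng, (2.8, 3.1), n=100)
best, scores = select_standard_speaker(candidates, learner)
print(best, scores)
```

In this toy setup the learner's frames are drawn near speaker_B's cluster, so that speaker scores higher. With real data, the same scoring could in principle be repeated with pitch or rhythm features in place of MFCC to compare the three feature types the abstract mentions.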
Acknowledgments
The research underlying this paper was supported by the National Natural Science Foundation of China (61175019) and the Youth Independent Research Program Projects of Beijing Language and Culture University (10JBT01).
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xie, Y., Zhang, J., Shi, S. (2013). Standard Speaker Selection in Speech Synthesis for Mandarin Tone Learning. In: Lu, W., Cai, G., Liu, W., Xing, W. (eds) Proceedings of the 2012 International Conference on Information Technology and Software Engineering. Lecture Notes in Electrical Engineering, vol 212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34531-9_39
DOI: https://doi.org/10.1007/978-3-642-34531-9_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34530-2
Online ISBN: 978-3-642-34531-9
eBook Packages: Engineering, Engineering (R0)